This paper introduces a risk-aware safe reinforcement learning control design for stochastic discrete-time linear systems.
It combines a learned safe controller with a reinforcement learning controller to provide high-confidence safety certificates without requiring a high-fidelity system model.
The approach avoids myopic interventions and convergence to undesired equilibria by optimizing over a scalar decision variable and over polyhedral sets characterized via linear programming.
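To make this structure concrete, one common realization of such a scheme (an illustrative sketch with assumed notation, not necessarily the paper's exact formulation) blends the two controllers through a single scalar $\theta$ and certifies the result over a polyhedral safe set:
\[
u_t = \theta\, u_t^{\mathrm{RL}} + (1-\theta)\, u_t^{\mathrm{safe}}, \qquad \theta \in [0,1],
\]
where $\theta$ is chosen as large as possible subject to the blended input keeping the state in a polyhedral set $\mathcal{S} = \{x : Hx \le h\}$ with high probability; for linear dynamics and polyhedral constraints, this selection reduces to a linear program. Here $u_t^{\mathrm{RL}}$, $u_t^{\mathrm{safe}}$, $H$, and $h$ are assumed symbols introduced for illustration.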
Safe controllers are learned within a piecewise-affine class, which admits large invariant sets and yields computationally efficient solutions with reduced data requirements.
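For concreteness, a piecewise-affine safe controller over a polyhedral partition can be written as (a standard parameterization assumed here for illustration):
\[
u^{\mathrm{safe}}(x) = K_i x + k_i \quad \text{for } x \in \mathcal{R}_i = \{x : H_i x \le h_i\}, \qquad i = 1, \dots, N,
\]
where the gains $(K_i, k_i)$ and the polyhedral regions $\mathcal{R}_i$ are learned so that the union $\bigcup_i \mathcal{R}_i$ forms an invariant set for the closed-loop dynamics.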