Source: arXiv

A Provable Approach for End-to-End Safe Reinforcement Learning

  • Researchers have proposed Provably Lifetime Safe RL (PLS), a new method for safe reinforcement learning (RL).
  • PLS integrates offline safe RL with safe policy deployment to ensure the safety of a policy from learning to operation.
  • The method learns a policy offline using return-conditioned supervised learning and optimizes target returns using Gaussian processes (GPs) during deployment.
  • Empirical results show that PLS outperforms baselines in both safety and reward, achieving high rewards while keeping the policy safe.
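
The deployment step described above, choosing target returns with Gaussian processes, can be illustrated with a small sketch. This is not the PLS algorithm from the paper. It assumes a simple confidence-bound rule: fit one GP to observed rewards and one to observed costs, then try only targets whose pessimistic cost estimate stays under a limit. The names `deploy`, `pick_target`, and `run`, the toy reward and cost functions, and the values of `beta`, `noise`, and `cost_limit` are all hypothetical choices made for illustration.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel on scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-2, length=1.0):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k = [rbf(a, x_new, length) for a in xs]
    alpha = solve(K, ys)
    v = solve(K, k)
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = 1.0 - sum(ki * vi for ki, vi in zip(k, v))  # rbf(x, x) == 1
    return mean, math.sqrt(max(var, 0.0))

def pick_target(tried, rewards, costs, candidates, cost_limit, beta=2.0):
    """Pick the candidate with the best optimistic reward among those
    whose pessimistic (upper-bound) cost stays within cost_limit."""
    best, best_score = None, -math.inf
    for t in candidates:
        c_mean, c_std = gp_posterior(tried, costs, t)
        if c_mean + beta * c_std > cost_limit:
            continue  # cost too uncertain or too high: skip this target
        r_mean, r_std = gp_posterior(tried, rewards, t)
        score = r_mean + beta * r_std
        if score > best_score:
            best, best_score = t, score
    return best

def deploy(target):
    """Toy stand-in (an assumption) for running a return-conditioned
    policy once with the given target; returns (reward, cost)."""
    return target, 0.5 * target ** 2

def run(rounds, cost_limit=1.0):
    """Start from a known-safe seed target and expand cautiously."""
    candidates = [i / 10 for i in range(21)]
    tried, rewards, costs = [0.0], [0.0], [0.0]
    for _ in range(rounds):
        t = pick_target(tried, rewards, costs, candidates, cost_limit)
        if t is None:
            break  # no candidate is currently believed safe
        r, c = deploy(t)
        tried.append(t)
        rewards.append(r)
        costs.append(c)
    return tried, rewards, costs
```

Calling `run(5)` returns the targets tried along with their observed rewards and costs. Because the GP prior is uncertain far from observed data, the sketch only moves to targets near ones it has already evaluated, which is the intuition behind expanding from a safe starting point. Whether a rule like this yields a formal guarantee depends on assumptions this toy example does not check.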
