Source: arXiv

SPoRt -- Safe Policy Ratio: Certified Training and Deployment of Task Policies in Model-Free RL

  • Applying reinforcement learning to safety-critical applications requires safety guarantees during both policy training and deployment.
  • The paper introduces the Safe Policy Ratio (SPoRt), which bounds the probability of violating a safety property in a model-free, episodic setting.
  • SPoRt includes Projected PPO, a new approach for training task-specific policies while maintaining a user-specified bound on property violation.
  • The experimental results demonstrate the trade-off between safety guarantees and task-specific performance in SPoRt.
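
To make the idea of a bound on violation probability in a model-free, episodic setting more concrete, here is a minimal sketch. It is not the SPoRt bound or the Projected PPO algorithm from the paper, which this summary does not describe in detail. It is a generic Hoeffding-style upper confidence bound computed from rollouts alone, assuming independent episodes. The function names, the simulated `run_episode` rollout, and the `true_violation_prob` parameter are hypothetical stand-ins for illustration.

```python
import math
import random


def hoeffding_upper_bound(violations: int, episodes: int, delta: float) -> float:
    """Upper confidence bound on the per-episode violation probability.

    By the one-sided Hoeffding inequality, the bound holds with probability
    at least 1 - delta, assuming episodes are independent and identically
    distributed. This is a generic statistical bound, not the SPoRt bound.
    """
    if episodes <= 0:
        raise ValueError("episodes must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    empirical_rate = violations / episodes
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * episodes))
    return min(1.0, empirical_rate + slack)


def run_episode(rng: random.Random, true_violation_prob: float) -> bool:
    """Hypothetical stand-in for one rollout; returns True on a violation."""
    return rng.random() < true_violation_prob


def estimate_violation_bound(
    episodes: int, delta: float, true_violation_prob: float, seed: int = 0
) -> float:
    """Run simulated episodes and bound the violation probability."""
    rng = random.Random(seed)
    violations = sum(
        run_episode(rng, true_violation_prob) for _ in range(episodes)
    )
    return hoeffding_upper_bound(violations, episodes, delta)


if __name__ == "__main__":
    bound = estimate_violation_bound(
        episodes=2000, delta=0.05, true_violation_prob=0.02
    )
    print(f"Violation probability is at most {bound:.4f} with 95% confidence")
```

In a real setup, `run_episode` would roll out the policy in the environment and report whether the safety property was violated. The bound tightens as the number of episodes grows, which illustrates why a certificate of this kind costs data, and why tighter safety requirements can constrain what a task policy is allowed to do.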
