Arxiv

1w

read

37

img
dot

Image Credit: Arxiv

Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators

  • Nonlinear control systems in which the decision maker has only partial information arise in a wide variety of applications.
  • This work explores reinforcement learning methods for finding the optimal policy in nearly linear-quadratic regulator (LQR) systems.
  • The cost function of the system is nonconvex in general, but the study shows that it is strongly convex and smooth in a neighborhood of the global optimizer.
  • The authors propose a policy gradient algorithm that is guaranteed to converge to the globally optimal policy at a linear rate.
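
To make the idea of policy gradient on a regulator problem concrete, here is a minimal sketch for the purely linear special case only: a discrete-time LQR with a linear feedback policy u = -Kx. It is not the paper's algorithm. It ignores the nonlinear component and the partial-information setting, uses the exact model-based gradient of the LQR cost rather than sampled estimates, and swaps in a simple backtracking step size so that the cost never increases. The system matrices, the initial gain `K0`, and all function names are illustrative assumptions.

```python
import numpy as np

def solve_lyapunov(M, W):
    """Solve X = W + M X M^T by vectorisation (adequate for small systems)."""
    n = M.shape[0]
    lhs = np.eye(n * n) - np.kron(M, M)
    return np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)

def is_stabilizing(A, B, K):
    """True if the closed loop A - B K has spectral radius below 1."""
    return np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0

def cost_and_gradient(A, B, Q, R, K):
    """Infinite-horizon LQR cost and its gradient in K, assuming E[x0 x0^T] = I."""
    Acl = A - B @ K
    # Value matrix: P = Q + K^T R K + Acl^T P Acl
    P = solve_lyapunov(Acl.T, Q + K.T @ R @ K)
    # Aggregate state covariance: Sigma = I + Acl Sigma Acl^T
    Sigma = solve_lyapunov(Acl, np.eye(A.shape[0]))
    grad = 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    return np.trace(P), grad

def policy_gradient(A, B, Q, R, K0, iters=200, step0=1.0):
    """Gradient descent on K with backtracking (an assumption of this sketch)."""
    K = K0.copy()
    cost, grad = cost_and_gradient(A, B, Q, R, K)
    costs = [cost]
    for _ in range(iters):
        step = step0
        while step > 1e-12:
            K_new = K - step * grad
            if is_stabilizing(A, B, K_new):
                new_cost, new_grad = cost_and_gradient(A, B, Q, R, K_new)
                if new_cost < cost:
                    break  # accept the first step that lowers the cost
            step *= 0.5
        else:
            break  # no improving step found, so stop early
        K, cost, grad = K_new, new_cost, new_grad
        costs.append(cost)
    return K, costs

# Hypothetical example: a double integrator sampled at 0.1 s.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[2.0, 3.0]])  # stabilizing: closed-loop eigenvalues 0.9 and 0.8

K_final, costs = policy_gradient(A, B, Q, R, K0)
print("initial cost:", costs[0])
print("final cost:  ", costs[-1])
print("final gain:  ", K_final)
```

The backtracking loop is a design choice made for this sketch: it keeps every iterate stabilizing and the cost sequence decreasing without a hand-tuned step size. The linear convergence rate reported in the article depends on the paper's own assumptions and is not something this toy example demonstrates.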
