Arxiv

Policy Gradient for LQR with Domain Randomization

  • Domain randomization (DR) enables sim-to-real transfer by training controllers on a distribution of simulated environments.
  • Simple policy gradient (PG) methods are often used to solve DR problems, but their theoretical guarantees are limited.
  • This study provides a convergence analysis of PG methods for domain-randomized linear quadratic regulation (LQR).
  • It shows that PG converges globally under suitable bounds on the heterogeneity of the sampled systems, and it proposes a discount-factor annealing algorithm.
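
To make the setup concrete, here is a minimal sketch of gradient descent on a sample-average LQR cost over a few sampled systems. It is an illustration under stated assumptions, not the study's algorithm: it uses the standard exact (model-based) LQR gradient, a backtracking step size, and initial states with identity covariance. All function names, the toy systems, and the step-size rule are hypothetical choices made for this example.

```python
import numpy as np

def dlyap(F, W):
    """Solve X = W + F X F^T by vectorization (needs spectral radius of F < 1)."""
    n = F.shape[0]
    vec = np.linalg.solve(np.eye(n * n) - np.kron(F, F), W.reshape(-1))
    return vec.reshape(n, n)

def cost_and_grad(K, A, B, Q, R, gamma=1.0):
    """Exact discounted LQR cost and gradient for u = -K x, with E[x0 x0^T] = I."""
    n = A.shape[0]
    F = np.sqrt(gamma) * (A - B @ K)  # discount folded into the closed loop
    if np.max(np.abs(np.linalg.eigvals(F))) >= 1.0:
        return np.inf, None  # cost is infinite if the loop is not stable
    P = dlyap(F.T, Q + K.T @ R @ K)   # cost-to-go matrix
    Sigma = dlyap(F, np.eye(n))       # discounted state covariance
    E = (R + gamma * B.T @ P @ B) @ K - gamma * B.T @ P @ A
    return np.trace(P), 2.0 * E @ Sigma

def dr_cost_and_grad(K, systems, Q, R, gamma=1.0):
    """Average cost and gradient over the sampled systems (the DR objective)."""
    total_cost, total_grad = 0.0, np.zeros_like(K)
    for A, B in systems:
        c, g = cost_and_grad(K, A, B, Q, R, gamma)
        if g is None:
            return np.inf, None
        total_cost += c
        total_grad += g
    return total_cost / len(systems), total_grad / len(systems)

def dr_policy_gradient(systems, Q, R, K0, step=1e-2, iters=200, gamma=1.0):
    """Gradient descent with backtracking (an assumption of this sketch)."""
    K = K0.astype(float)
    cost, grad = dr_cost_and_grad(K, systems, Q, R, gamma)
    if grad is None:
        raise ValueError("K0 must stabilize every sampled system")
    for _ in range(iters):
        eta = step
        while eta > 1e-12:
            K_new = K - eta * grad
            new_cost, new_grad = dr_cost_and_grad(K_new, systems, Q, R, gamma)
            if new_cost < cost:  # accept only finite, improving steps
                K, cost, grad = K_new, new_cost, new_grad
                break
            eta *= 0.5
        else:
            break  # no improving step found; stop early
    return K, cost

# Toy example: three mildly heterogeneous systems around a nominal model.
A0 = np.array([[0.9, 0.2], [0.0, 0.8]])
B0 = np.array([[0.0], [1.0]])
systems = [(A0 + d * np.eye(2), B0) for d in (-0.05, 0.0, 0.05)]
Q, R = np.eye(2), np.eye(1)
K0 = np.zeros((1, 2))  # stabilizes all three systems, since each A is stable
K_dr, final_cost = dr_policy_gradient(systems, Q, R, K0)
print("gain:", K_dr, "average cost:", final_cost)
```

The sketch starts from a gain that stabilizes every sampled system at once, which is easy here only because each toy system is already stable. The `gamma` argument is exposed because the summary mentions discounting; the annealing schedule itself is not described in the summary and is not implemented here.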
