
Source: arXiv


Stealing That Free Lunch: Exposing the Limits of Dyna-Style Reinforcement Learning

  • Dyna-style off-policy model-based reinforcement learning (DMBRL) algorithms show a performance gap across benchmark environments.
  • DMBRL algorithms perform well in OpenAI Gym, but their performance drops significantly in the DeepMind Control Suite (DMC) with proprioceptive observations.
  • Modern techniques meant to address the issues seen in these settings do not improve performance consistently across environments.
  • Adding synthetic rollouts to training, the backbone of Dyna-style algorithms, significantly degrades performance in most DMC environments.
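
To make "adding synthetic rollouts to training" concrete, here is a minimal sketch of the Dyna idea using classic tabular Dyna-Q. It is an illustration only, not the deep off-policy algorithms the paper evaluates: the function name `dyna_q_update`, the dictionary-based Q-table and model, and all hyperparameter defaults are assumptions made for this example. Terminal states are ignored for brevity.

```python
import random

def dyna_q_update(q, model, s, a, r, s2, n_actions,
                  alpha=0.5, gamma=0.9, n_synthetic=5, rng=None):
    """One Dyna-Q step: learn from a real transition, then from model replays.

    q     : dict mapping (state, action) -> estimated value (hypothetical layout)
    model : dict mapping (state, action) -> (reward, next_state), assumed deterministic
    """
    rng = rng or random

    def td_update(state, action, reward, next_state):
        # Standard Q-learning target: reward plus discounted best next value.
        best_next = max(q.get((next_state, b), 0.0) for b in range(n_actions))
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

    # 1. Direct RL: update from the real environment transition.
    td_update(s, a, r, s2)

    # 2. Model learning: remember what the environment did.
    model[(s, a)] = (r, s2)

    # 3. Planning: extra updates from synthetic transitions drawn from the model.
    seen = sorted(model)
    for _ in range(n_synthetic):
        ps, pa = rng.choice(seen)
        pr, ps2 = model[(ps, pa)]
        td_update(ps, pa, pr, ps2)


q, model = {}, {}
dyna_q_update(q, model, s=0, a=1, r=1.0, s2=1, n_actions=2, n_synthetic=1)
print(q[(0, 1)])  # 0.75: one real update (0.5) plus one synthetic update
```

Setting `n_synthetic=0` reduces this to plain model-free Q-learning, which is the kind of comparison the summary describes: the same learner with and without model-generated data. If the learned model is inaccurate, step 3 trains on wrong transitions, which is one way synthetic data can hurt rather than help.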
