Source: arXiv
Adapting Offline Reinforcement Learning with Online Delays

  • Offline-to-online deployment of reinforcement-learning (RL) agents needs to address both the sim-to-real gap and the interaction gap.
  • A new framework called DT-CORL (Delay-Transformer belief policy Constrained Offline RL) is introduced to handle delayed dynamics during deployment.
  • DT-CORL produces delay-robust actions with a transformer-based belief predictor and is more sample-efficient than history-augmentation baselines (see the sketch after this list).
  • Experiments on D4RL benchmarks demonstrate that DT-CORL outperforms other methods, bridging the sim-to-real latency gap while maintaining data efficiency.
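
To make the belief-predictor idea concrete, here is a minimal sketch: when observations arrive d steps late, a small transformer consumes the last observed state together with the actions executed since it was measured, and predicts the present state for the policy to act on. This is an illustrative PyTorch assumption, not the paper's code; the class names, dimensions, and the linear-policy stand-in are all hypothetical.

```python
import torch
import torch.nn as nn

class DelayBeliefPredictor(nn.Module):
    """Hypothetical belief model: predicts the current (unobserved) state
    from the last delayed observation plus the actions executed since."""

    def __init__(self, state_dim, action_dim, d_model=64, n_heads=4,
                 n_layers=2, max_delay=8):
        super().__init__()
        self.state_in = nn.Linear(state_dim, d_model)
        self.action_in = nn.Linear(action_dim, d_model)
        self.pos = nn.Embedding(max_delay + 1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.state_out = nn.Linear(d_model, state_dim)

    def forward(self, delayed_state, action_buffer):
        # delayed_state: (B, state_dim), observed d steps ago
        # action_buffer: (B, d, action_dim), actions taken since then
        tokens = torch.cat([self.state_in(delayed_state).unsqueeze(1),
                            self.action_in(action_buffer)], dim=1)
        positions = torch.arange(tokens.size(1), device=tokens.device)
        h = self.encoder(tokens + self.pos(positions))
        return self.state_out(h[:, -1])  # belief over the current state

# Acting under a 4-step observation delay: predict the present state, then
# query the (offline-trained) policy on that belief instead of the stale
# observation. The linear policy here is only a placeholder.
belief = DelayBeliefPredictor(state_dim=17, action_dim=6)
policy = nn.Linear(17, 6)
s_delayed = torch.randn(1, 17)   # last observation to arrive
a_buffer = torch.randn(1, 4, 6)  # actions still "in flight"
action = policy(belief(s_delayed, a_buffer))
```

The design point, as the summary describes it, is that the policy never conditions on the stale observation directly: it acts on the predicted belief, which is what makes its actions delay-robust without feeding the full raw history into the policy itself.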
