techminis

A naukri.com initiative

Source: arXiv

Bridging Supervised and Temporal Difference Learning with $Q$-Conditioned Maximization

  • The paper combines ideas from supervised learning (SL) and temporal difference (TD) learning to strengthen SL-based offline reinforcement learning (RL).
  • It introduces Goal-Conditioned Reinforced Supervised Learning (GCReinSL), an approach that equips SL methods with trajectory stitching capability.
  • This approach incorporates $Q$-conditioned policy and $Q$-conditioned maximization to bridge the performance gap between SL and TD learning in offline goal-conditioned RL.
  • Experimental results show that GCReinSL outperforms prior SL methods that have trajectory stitching capabilities, as well as goal data augmentation techniques.
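
To make "trajectory stitching" concrete, here is a minimal toy sketch in Python. It is not GCReinSL and does not reproduce anything from the paper; the five-state chain, the reward of 1 on reaching the goal, and the helper names `hindsight_pairs` and `td_q_values` are all assumptions made up for illustration. It only shows the gap the summary refers to: a supervised learner trained on hindsight (state, goal) pairs never sees the pair (0, 4), while TD-style backups over the same offline transitions propagate value from the goal back to state 0 through the shared state 2.

```python
# Toy illustration only (hypothetical setup, not the paper's method).
# Two offline trajectories over states 0..4; no single one goes from 0 to 4.
trajectories = [[0, 1, 2], [2, 3, 4]]
GOAL = 4
GAMMA = 0.9

# Pool (state, next_state) transitions across trajectories.
# The "action" is simply the choice of next state.
transitions = [(s, s2) for traj in trajectories for s, s2 in zip(traj, traj[1:])]


def hindsight_pairs(trajs):
    """(state, goal) pairs available to hindsight-relabelled supervised
    learning: a goal is only a state reached later in the SAME trajectory."""
    pairs = set()
    for traj in trajs:
        for i, s in enumerate(traj):
            for g in traj[i + 1:]:
                pairs.add((s, g))
    return pairs


def td_q_values(transitions, goal, gamma, sweeps=50):
    """Tabular Q backups restricted to transitions present in the dataset.
    Reward is 1 for a transition that reaches the goal, 0 otherwise."""
    q = {t: 0.0 for t in transitions}
    for _ in range(sweeps):
        for s, s2 in transitions:
            if s2 == goal:
                q[(s, s2)] = 1.0
            else:
                # Bootstrap from the best in-dataset action at the next state.
                next_qs = [q[(a, b)] for (a, b) in transitions if a == s2]
                q[(s, s2)] = gamma * max(next_qs, default=0.0)
    return q


pairs = hindsight_pairs(trajectories)
q = td_q_values(transitions, GOAL, GAMMA)

print("SL has training data for (state 0, goal 4):", (0, GOAL) in pairs)
print("TD value of leaving state 0 toward the goal:", q[(0, 1)])
```

In this toy setting the supervised learner has no example telling it how to reach state 4 from state 0, whereas the TD backups assign a positive value to the first step of the stitched path 0 → 1 → 2 → 3 → 4. Closing that kind of gap for SL methods is what the summary describes as the aim of the $Q$-conditioned policy and $Q$-conditioned maximization.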
