
Arxiv


TROFI: Trajectory-Ranked Offline Inverse Reinforcement Learning

  • TROFI is a new approach in offline reinforcement learning that aims to train agents without a predefined reward function.
  • It first learns a reward function from human preferences, then uses that function to label the dataset so that a policy can be trained on it.
  • Experiments on the D4RL benchmark show that TROFI outperforms the baselines and performs comparably to training the policy with the ground-truth reward.
  • TROFI is also validated in a 3D game environment, and the results emphasize the importance of a well-engineered reward function in reinforcement learning.
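
The two-step idea in the summary (learn a reward from preferences, then label the dataset with it) can be illustrated with a small sketch. This is not the paper's implementation: the summary does not specify the reward model or the loss, so the linear reward, the pairwise Bradley-Terry likelihood, and every name below (`fit_reward`, `relabel`, the toy trajectories) are assumptions chosen only for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def traj_features(traj):
    """Sum the per-state feature vectors of one trajectory."""
    dim = len(traj[0])
    return [sum(state[k] for state in traj) for k in range(dim)]


def fit_reward(trajs, prefs, lr=0.1, epochs=200):
    """Fit weights w of an assumed linear reward r(s) = w . s.

    prefs is a list of (worse, better) trajectory index pairs,
    standing in for human preference labels.
    """
    feats = [traj_features(t) for t in trajs]
    w = [0.0] * len(feats[0])
    for _ in range(epochs):
        for worse, better in prefs:
            diff = [b - a for a, b in zip(feats[worse], feats[better])]
            # Probability the current model assigns to the labeled ordering.
            p = sigmoid(dot(w, diff))
            # Gradient ascent on the Bradley-Terry log-likelihood log p.
            w = [wk + lr * (1.0 - p) * dk for wk, dk in zip(w, diff)]
    return w


def relabel(trajs, w):
    """Attach a learned reward to every state of every trajectory."""
    return [[(state, dot(w, state)) for state in traj] for traj in trajs]


# Toy reward-free dataset: three trajectories of 2-D state features.
trajs = [
    [[0.0, 1.0], [0.1, 0.9]],
    [[0.5, 0.5], [0.4, 0.2]],
    [[1.0, 0.0], [0.9, 0.3]],
]
# Hypothetical preference labels: trajectory 2 > trajectory 1 > trajectory 0.
prefs = [(0, 1), (1, 2), (0, 2)]

w = fit_reward(trajs, prefs)
labeled = relabel(trajs, w)
returns = [sum(r for _, r in traj) for traj in labeled]
print(returns)
```

The relabeled trajectories now carry a reward for every state, so in principle any offline reinforcement learning algorithm could be trained on them; which algorithm TROFI uses for that step is not stated in the summary.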
