
Source: arXiv


Information-Theoretic Reward Decomposition for Generalizable RLHF

  • A new approach to learning generalizable reward models for Reinforcement Learning from Human Feedback (RLHF) is proposed.
  • Existing reward models cannot reliably evaluate unseen prompt-response pairs.
  • The proposed approach decomposes the reward value into a prompt-free reward and a prompt-related reward.
  • The new reward learning algorithm prioritizes data samples based on their prompt-free reward values.
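
To make the decomposition concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the summary does not say how the two components are estimated or in which direction samples are prioritized. The sketch assumes the prompt-free reward can be approximated by averaging a reward function over a set of reference prompts, that the prompt-related reward is the remainder, and that preference pairs whose prompt-free reward gap is large are down-weighted. The names toy_reward, decompose_reward, and priority_weights, the reference prompts, and the temperature parameter are all hypothetical.

```python
import math
from statistics import mean
from typing import Callable, List, Sequence, Tuple

RewardFn = Callable[[str, str], float]


def toy_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a learned reward model.

    Rewards longer responses (a prompt-independent bias) and word
    overlap with the prompt (a prompt-dependent signal).
    """
    overlap = len(set(prompt.lower().split()) & set(response.lower().split()))
    return 0.1 * len(response.split()) + overlap


def decompose_reward(
    reward_fn: RewardFn,
    prompt: str,
    response: str,
    reference_prompts: Sequence[str],
) -> Tuple[float, float]:
    """Split a reward into (prompt_free, prompt_related).

    Assumption: the prompt-free part is the mean reward of the response
    over reference prompts; the prompt-related part is what remains.
    """
    total = reward_fn(prompt, response)
    prompt_free = mean([reward_fn(p, response) for p in reference_prompts])
    return prompt_free, total - prompt_free


def priority_weights(
    prompt_free_gaps: Sequence[float], temperature: float = 1.0
) -> List[float]:
    """Turn prompt-free reward gaps (chosen minus rejected) into weights.

    Assumption: pairs with a larger prompt-free gap get a smaller
    weight, since the prompt contributes less to that preference.
    Weights are normalized to sum to 1.
    """
    scores = [math.exp(-gap / temperature) for gap in prompt_free_gaps]
    norm = sum(scores)
    return [s / norm for s in scores]


if __name__ == "__main__":
    refs = ["hello there", "what is rlhf"]
    free, related = decompose_reward(
        toy_reward, "what is rlhf", "rlhf is reward learning", refs
    )
    print(f"prompt-free={free:.2f}, prompt-related={related:.2f}")
    print(priority_weights([0.0, 2.0]))
```

By construction, the two components sum back to the original reward, so the split only reattributes the score rather than changing it. A real implementation would replace toy_reward with a trained reward model and would need a principled choice of reference prompts and weighting rule.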
