Reinforcement learning from human feedback (RLHF) is crucial for aligning model behavior with user goals.
Current RLHF methods oversimplify human decision-making by reducing reward learning to an isolated task, such as pure classification or pure regression.
A new reinforcement learning method, presented in a recent arXiv paper, jointly considers both kinds of task to more closely mimic how humans actually make decisions.
The proposed method learns a reward function from human ratings in reward-free settings, striking a balance between classification and regression models.
This approach accounts for the uncertainty inherent in human decision-making and allows the method to adaptively shift emphasis between the two modeling strategies.
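To make the classification-regression balance concrete, here is a minimal PyTorch sketch of one way such a rating-based reward model could be built: a scalar reward head (the regression side) whose output is mapped to discrete rating classes through ordered cutpoints (the classification side), yielding a probability distribution over ratings that can express rater uncertainty. The name `RatingRewardModel`, the cutpoint parameterization, and all hyperparameters are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RatingRewardModel(nn.Module):
    """Hypothetical sketch: a scalar reward head mapped to discrete rating
    classes via ordered cutpoints (a cumulative-link, ordinal-style model)."""

    def __init__(self, obs_dim: int, n_ratings: int = 5, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar reward: the regression side
        )
        # n_ratings - 1 cutpoints partition the reward line into classes;
        # a robust implementation would constrain them to stay ordered.
        self.cutpoints = nn.Parameter(torch.linspace(-1.0, 1.0, n_ratings - 1))

    def reward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs).squeeze(-1)

    def rating_log_probs(self, obs: torch.Tensor) -> torch.Tensor:
        """log P(rating = k | obs) as differences of sigmoid CDFs."""
        r = self.reward(obs).unsqueeze(-1)           # (batch, 1)
        cdf = torch.sigmoid(self.cutpoints - r)      # P(rating <= k), (batch, K-1)
        one = torch.ones_like(cdf[..., :1])
        zero = torch.zeros_like(one)
        probs = torch.cat([cdf, one], -1) - torch.cat([zero, cdf], -1)
        return probs.clamp_min(1e-8).log()


def rating_loss(model: RatingRewardModel,
                obs: torch.Tensor,
                ratings: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on discrete ratings: the classification side of the
    loss, with gradients flowing through the scalar regression head."""
    return F.nll_loss(model.rating_log_probs(obs), ratings)


# Example: one gradient step on a batch of rated observations (toy shapes).
model = RatingRewardModel(obs_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
obs = torch.randn(32, 8)
ratings = torch.randint(0, 5, (32,))
opt.zero_grad()
rating_loss(model, obs, ratings).backward()
opt.step()
```

Because the loss is cross-entropy over rating classes while the gradient flows through a single scalar reward, a model of this shape sits between a pure classifier and a pure regressor, and the predicted class distribution gives a direct handle on rating uncertainty.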
Experiments with synthetic human ratings demonstrate the superior performance of the new method over existing rating-based RL techniques.
The novel method even outperforms traditional RL approaches in certain scenarios.
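The summary does not specify how the synthetic human ratings were produced. One common and plausible protocol is to bin ground-truth segment returns into ordered rating classes and inject label noise to mimic rater inconsistency; the function below, including the name `synthetic_ratings` and the noise model, is a hypothetical sketch under that assumption.

```python
import numpy as np


def synthetic_ratings(returns: np.ndarray, n_ratings: int = 5,
                      noise: float = 0.1, rng=None) -> np.ndarray:
    """Simulate noisy human ratings (assumed protocol): bin ground-truth
    segment returns into n_ratings ordered classes at quantile boundaries,
    then occasionally shift a label by one class to mimic rater noise."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Quantile cutpoints so each rating class is roughly equally populated.
    edges = np.quantile(returns, np.linspace(0, 1, n_ratings + 1)[1:-1])
    ratings = np.digitize(returns, edges)            # classes 0 .. n_ratings-1
    # With probability `noise`, move a rating up or down by one class.
    flip = rng.random(len(returns)) < noise
    ratings[flip] += rng.choice([-1, 1], size=flip.sum())
    return np.clip(ratings, 0, n_ratings - 1)
```

Training the reward model on ratings generated this way, then running a standard RL algorithm on the learned reward, would reproduce the reward-free evaluation setup described above.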