Arxiv


Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only

  • Improving the performance of pre-trained policies through online reinforcement learning is crucial yet challenging.
  • Existing online RL fine-tuning methods often require continued training with offline pre-trained Q-functions to remain stable and perform well.
  • A proposed method, PORL (Policy-Only Reinforcement Learning Fine-Tuning), uses only the offline pre-trained policy for efficient online RL fine-tuning.
  • PORL rapidly initializes the Q-function from scratch during the online phase, avoiding the pessimism of an offline pre-trained Q-function, and achieves performance competitive with advanced offline-to-online RL algorithms.
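
The policy-only idea in the points above can be illustrated with a toy sketch. This is not the PORL algorithm itself, whose details are not given in this summary. It assumes a hypothetical five-state chain environment, a hypothetical tabular "pre-trained" policy, and expected-SARSA updates as a stand-in for whatever Q-function initialization the paper uses. All names (`step`, `warm_up_q`, `improve_policy`) are illustrative only.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed value)

def step(state, action):
    """Toy deterministic chain: reward 1.0 only on reaching the last state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def sample_action(policy, state, rng):
    """Sample an action from a tabular stochastic policy."""
    return 1 if rng.random() < policy[state][1] else 0

def warm_up_q(policy, episodes, rng, lr=0.1, max_steps=50):
    """Fit a fresh Q-table to the frozen policy using online rollouts."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # from scratch, no offline Q
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = sample_action(policy, s, rng)
            s2, r, done = step(s, a)
            # Expected-SARSA target: bootstrap with the policy's own expected
            # value at the next state, with no conservative penalty term.
            v2 = 0.0 if done else sum(policy[s2][b] * q[s2][b] for b in ACTIONS)
            q[s][a] += lr * (r + GAMMA * v2 - q[s][a])
            if done:
                break
            s = s2
    return q

def improve_policy(q, eps=0.1):
    """One improvement step: epsilon-greedy policy with respect to Q."""
    policy = []
    for s in range(N_STATES):
        best = max(ACTIONS, key=lambda a: q[s][a])
        policy.append([1.0 - eps / 2 if a == best else eps / 2 for a in ACTIONS])
    return policy

# Usage: start from a mediocre "pre-trained" policy (assumed), warm up Q, improve.
rng = random.Random(0)
pretrained_policy = [[0.3, 0.7] for _ in range(N_STATES)]
q_table = warm_up_q(pretrained_policy, episodes=500, rng=rng)
finetuned_policy = improve_policy(q_table)
```

The design point the sketch tries to show is the ordering: the policy stays frozen while the new Q-table is fitted to it, and only then does the Q-table drive policy updates. A real implementation would use neural networks and a replay buffer rather than tables.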
