Researchers propose a reinforcement learning method that leverages prior data to guide exploration rather than relying on explicit imitation learning objectives.
The approach, Data-Guided Noise (DGN), shapes the noise added to the policy using prior demonstrations, improving sample efficiency.
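To make the idea concrete, here is a minimal sketch of what demonstration-guided exploration noise could look like. This is an illustrative assumption, not the authors' implementation: the class name, the nearest-neighbor lookup over demonstration states, and the specific noise model are all hypothetical.

```python
# Sketch: bias exploration noise toward demonstrated actions instead of
# adding an explicit imitation-learning loss. All names and the noise
# model below are illustrative assumptions, not DGN's actual algorithm.
import numpy as np

class DataGuidedNoiseSketch:
    """Hypothetical exploration wrapper that perturbs the policy's action
    toward the action taken at the nearest demonstrated state."""

    def __init__(self, demo_states, demo_actions, guidance_scale=0.3, sigma=0.1):
        self.demo_states = np.asarray(demo_states)    # (N, state_dim)
        self.demo_actions = np.asarray(demo_actions)  # (N, action_dim)
        self.guidance_scale = guidance_scale          # strength of the data-guided bias
        self.sigma = sigma                            # residual Gaussian noise scale

    def explore(self, state, policy_action):
        # Find the demonstration state closest to the current state.
        dists = np.linalg.norm(self.demo_states - state, axis=1)
        a_demo = self.demo_actions[np.argmin(dists)]
        # Nudge the policy's action toward the demonstrated one, then add
        # small Gaussian noise, so exploration is guided by the prior data.
        guided = policy_action + self.guidance_scale * (a_demo - policy_action)
        return guided + self.sigma * np.random.randn(*policy_action.shape)

# Toy usage with random demonstration data:
rng = np.random.default_rng(0)
dgn = DataGuidedNoiseSketch(rng.normal(size=(100, 4)), rng.normal(size=(100, 2)))
noisy_action = dgn.explore(state=rng.normal(size=4), policy_action=np.zeros(2))
```

The design intuition is that the demonstrations steer which directions get explored, while the policy itself is never trained to match them directly.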
DGN achieves significant improvements in the reinforcement-learning-from-offline-data setting, showing 2-3x gains across seven simulated continuous control tasks.
The method aims to overcome the limitations of traditional imitation learning objectives, relying instead on exploration guided by prior data to achieve better long-term performance.