This work focuses on data-efficient exploration in reinforcement learning by examining information-theoretic approaches to intrinsic motivation.
Exploration bonuses that target epistemic uncertainty are studied; these bonuses signal expected information gain and converge to zero as the agent learns about the environment.
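One common way to formalize such a bonus (stated here as a generic formulation, not necessarily the exact definition used in this work) is the expected information gain about the model parameters $\theta$ from observing the outcome $y$ of taking action $a$ in state $s$, given the data $\mathcal{D}$ collected so far:

$$
r^{\text{info}}(s,a) \;=\; I\!\left(\theta;\, y \mid s, a, \mathcal{D}\right) \;=\; H\!\left[\,p(y \mid s, a, \mathcal{D})\,\right] \;-\; \mathbb{E}_{\theta \sim p(\theta \mid \mathcal{D})}\!\left[\, H\!\left[\,p(y \mid s, a, \theta)\,\right]\,\right].
$$

Because this quantity is the mutual information between the parameters and the observation, it measures only epistemic (reducible) uncertainty and shrinks to zero once the posterior over $\theta$ concentrates.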
The analysis provides formal guarantees for these approaches and discusses practical approximations based on models such as sparse variational Gaussian processes and deep ensembles.
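As a rough illustration of the deep-ensemble approximation, the epistemic bonus can be proxied by the disagreement among ensemble members' predictions for the same input. The sketch below is illustrative only; the function name and array shapes are assumptions, not taken from the paper.

```python
import numpy as np

def ensemble_info_bonus(ensemble_predictions: np.ndarray) -> float:
    """Epistemic-uncertainty bonus from a deep ensemble (illustrative sketch).

    ensemble_predictions: array of shape (n_members, state_dim) holding each
    member's predicted next state for the same (state, action) input.
    The variance across members approximates epistemic uncertainty and
    shrinks toward zero as the members come to agree, i.e. as the model
    of the environment is learned.
    """
    # Disagreement = per-dimension variance across members, averaged over dimensions.
    return float(np.mean(np.var(ensemble_predictions, axis=0)))
```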
Finally, Predictive Trajectory Sampling with Bayesian Exploration (PTS-BE) is introduced: a framework that combines model-based planning with information-theoretic bonuses to achieve sample-efficient deep exploration. In the empirical evaluation, PTS-BE outperforms the other baselines across a variety of environments.
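To make the combination of model-based planning and information-theoretic bonuses concrete, the following is a minimal random-shooting planner that scores sampled trajectories by extrinsic reward plus an ensemble-disagreement bonus. It is a sketch of the general idea, not the authors' PTS-BE implementation; the interfaces `ensemble_predict`, `reward_fn`, and `action_space_sample` are hypothetical.

```python
import numpy as np

def plan_action(state, ensemble_predict, reward_fn, action_space_sample,
                horizon=10, n_candidates=100, bonus_weight=1.0):
    """Illustrative planner: pick the first action of the best-scoring rollout.

    ensemble_predict(state, action) -> array (n_members, state_dim) of
    next-state predictions from an ensemble dynamics model.
    reward_fn(state, action) -> extrinsic reward.
    action_space_sample() -> a random action.
    """
    best_return, best_first_action = -np.inf, None
    for _ in range(n_candidates):
        actions = [action_space_sample() for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            preds = ensemble_predict(s, a)            # ensemble next-state predictions
            bonus = np.mean(np.var(preds, axis=0))    # epistemic disagreement bonus
            total += reward_fn(s, a) + bonus_weight * bonus
            s = preds[np.random.randint(len(preds))]  # propagate via a random member
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action
```

Because the bonus vanishes as the ensemble members converge, the planner is drawn toward poorly understood regions early in training and gradually reverts to optimizing the extrinsic reward alone.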