<ul data-eligibleForWebStory="true"><li>Prioritized experience replay is a crucial component of value-based deep reinforcement learning models.</li><li>Transitions are typically prioritized based on temporal difference error, but this can favor noisy transitions.</li><li>Using epistemic uncertainty estimation is proposed to guide transition prioritization from the replay buffer.</li><li>Epistemic uncertainty quantifies uncertainty that can be reduced by learning, reducing sampled unpredictable transitions.</li><li>Benefits of epistemic uncertainty prioritized replay are illustrated in tabular toy models and evaluated on the Atari suite.</li><li>The approach outperformed quantile regression deep Q-learning benchmarks.</li><li>This method paves the way for uncertainty prioritized replay in reinforcement learning agents.</li></ul>

Uncertainty Prioritized Experience Replay

Discover more