<ul><li>The AlphaZero/MuZero (A/MZ) family of algorithms combines Monte Carlo Tree Search (MCTS) with learned models and has achieved remarkable success in a range of domains.</li><li>Epistemic MCTS (EMCTS) is introduced to account for the uncertainty that learned models bring into search and to improve exploration in sparse-reward environments.</li><li>On the task of writing code in the assembly language subleq, AlphaZero (AZ) with EMCTS is more sample-efficient than baseline AZ.</li><li>On the hard-exploration benchmark Deep Sea, search with EMCTS significantly outperforms equivalent uncertainty-estimation methods that do not use search, showing the benefit of search for epistemic uncertainty estimation.</li></ul>
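
For readers unfamiliar with subleq, the sketch below shows a minimal interpreter for the language as it is commonly defined: a single instruction with three operands `a`, `b`, `c` that subtracts `mem[a]` from `mem[b]` and jumps to `c` if the result is less than or equal to zero. The function name `run_subleq`, the step limit, and the convention of halting on a negative jump target are assumptions made for illustration. The summary above does not describe the exact task setup used with EMCTS.

```python
from typing import List


def run_subleq(memory: List[int], max_steps: int = 10_000) -> List[int]:
    """Run a subleq program stored in `memory` and return the final memory.

    Assumed conventions (not taken from the summarized work): code and data
    share one memory, execution starts at address 0, and a negative or
    out-of-range program counter halts the machine.
    """
    mem = list(memory)  # work on a copy so the caller's list is untouched
    pc = 0
    for _ in range(max_steps):
        if pc < 0 or pc + 2 >= len(mem):
            break  # halt
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]  # the single subleq operation
        pc = c if mem[b] <= 0 else pc + 3  # branch if result <= 0
    return mem


# Hypothetical example program: add mem[9] to mem[10] using mem[11] as scratch.
program = [
    9, 11, 3,    # scratch -= A  (scratch = -A), continue at 3
    11, 10, 6,   # B -= scratch  (B = B + A), continue at 6
    11, 11, -1,  # scratch -= scratch (scratch = 0), jump to -1: halt
    7, 5, 0,     # data: A = 7, B = 5, scratch = 0
]
result = run_subleq(program)
print(result[10])  # 12
```

Even a simple addition takes three instructions here, which gives some intuition for why generating correct subleq programs is a sparse-reward search problem.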

Epistemic Monte Carlo Tree Search
