Arxiv


Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving

  • Large language models struggle with automated theorem proving because rewards are sparse and the reasoning required is complex.
  • A new framework, self-generated goal-conditioned MDPs (sG-MDPs), addresses these challenges by letting agents generate and pursue their own subgoals in a structured manner.
  • The sG-MDP is solved with Monte Carlo Tree Search (MCTS)-like algorithms, implemented in the Bourbaki (7B) system, which uses multiple LLMs for subgoal generation and tactic synthesis.
  • Bourbaki (7B) achieves state-of-the-art results on PutnamBench, solving 26 problems and demonstrating the effectiveness of the approach.
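
To make the idea in the bullets above concrete, here is a minimal toy sketch of an MCTS-style search over a goal-conditioned problem with self-generated subgoals and a sparse reward. It is not Bourbaki's implementation. Everything in it is an assumption made for illustration: integers stand in for proof goals, the hand-written `actions` rules stand in for LLM-proposed subgoals and tactics, and the names `actions`, `step`, `Node`, `rollout`, and `search` are hypothetical. The reward is 1 only when every open goal is closed within a step budget, and 0 otherwise.

```python
import math
import random

# Toy proof state: a tuple of open goals (integers). Hypothetical stand-in
# for real proof goals; a goal g <= 1 is "trivial" and can be closed.
def actions(state):
    g = state[0]
    if g <= 1:
        return ["close"]           # tactic step: discharge the first goal
    acts = ["dec", "split"]        # subgoal steps proposed for the first goal
    if g % 2 == 0:
        acts.append("halve")
    return acts

def step(state, action):
    g, rest = state[0], state[1:]
    if action == "close":
        return rest
    if action == "dec":
        return (g - 1,) + rest
    if action == "halve":
        return (g // 2,) + rest
    return (g // 2, g - g // 2) + rest   # split into two subgoals

class Node:
    def __init__(self, state, depth):
        self.state, self.depth = state, depth
        self.children = {}         # action -> Node
        self.visits = 0
        self.value = 0.0           # sum of rollout rewards

def rollout(state, depth, budget, rng):
    # Random playout; sparse reward: 1 only if all goals close in budget.
    while state != () and depth < budget:
        state = step(state, rng.choice(actions(state)))
        depth += 1
    return 1.0 if state == () else 0.0

def search(root_goal, budget=6, iterations=2000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node((root_goal,), 0)
    for _ in range(iterations):
        node, path = root, [root]
        while node.state != () and node.depth < budget:
            untried = [a for a in actions(node.state) if a not in node.children]
            if untried:            # expansion: add one new child, then stop
                a = rng.choice(untried)
                node.children[a] = Node(step(node.state, a), node.depth + 1)
                node = node.children[a]
                path.append(node)
                break
            parent = node          # selection: UCT over fully expanded node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.value / ch.visits
                + c * math.sqrt(math.log(parent.visits) / ch.visits),
            )
            path.append(node)
        reward = rollout(node.state, node.depth, budget, rng)
        for n in path:             # backpropagation
            n.visits += 1
            n.value += reward
    # Read off a plan by following the most-visited child at each level.
    plan, node = [], root
    while node.children:
        a, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        plan.append(a)
    return plan, node.state == ()

if __name__ == "__main__":
    plan, solved = search(12)
    print(plan, solved)
```

In this toy, splitting a goal too eagerly exhausts the step budget, so the search has to learn from the sparse success signal which subgoal proposals are worth pursuing. A real system would replace the fixed rules in `actions` with model-generated subgoals and tactics checked by a proof assistant.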
