<ul><li>Epoch AI has released FrontierMath, a mathematical benchmark designed to evaluate advanced reasoning capabilities in AI systems.</li><li>Current AI models can solve fewer than 2% of FrontierMath problems, indicating a substantial gap between AI capabilities and expert-level mathematics.</li><li>FrontierMath problems are extremely challenging and require extended chains of precise reasoning across various mathematical domains.</li><li>While AI models are not yet on par with human mathematicians, benchmarks like FrontierMath offer a way to track and improve AI reasoning abilities.</li></ul>

A new math benchmark just dropped and leading AI models can solve 'less than 2%' of its problems... oh dear
