Research organization Epoch AI released FrontierMath, a new mathematics benchmark that challenges leading AI models. FrontierMath contains expert-level problems that AI models solve less than 2 percent of the time. Top AI models, including GPT-4o and Gemini 1.5 Pro, scored poorly on the benchmark. FrontierMath differs from other benchmarks by keeping its problem set private to prevent data contamination.