ARC Prize has announced the ARC-AGI-2 benchmark to evaluate how human-like AI models' intelligence is. The benchmark is more challenging than its predecessor because it factors in efficiency as well as performance. Non-reasoning models (pure LLMs) scored 0%, while human participants achieved a perfect score of 100%. OpenAI's o3 reasoning model received the highest score, 4.0%, but will not be released as a standalone model.