Analyticsindiamag



AI Models from Google, OpenAI, Anthropic Solve 0% of ‘Hard’ Coding Problems

  • A new research study finds that AI models from Google, OpenAI, and Anthropic solved none of the 'Hard' coding problems they were tested on, scoring 0%.
  • The study, conducted by researchers from multiple universities, identified shortcomings in existing coding benchmarks and introduced LiveCodeBench Pro to evaluate models on more challenging problems.
  • Models excelled at knowledge-heavy and logic-heavy problems but struggled with observation-heavy problems, which require novel insights.
  • The models frequently made algorithm-related errors and left room for improvement even when given multiple attempts.
  • Despite claims of surpassing elite humans, the models still lag significantly in tasks demanding unique solutions.
  • A separate analysis by Oxford researcher Toby Ord suggests that AI agents' success rates decline as tasks get longer, which poses a challenge for complex coding projects.
  • While AI agents are improving at longer tasks, high reliability is still achievable only on significantly shorter tasks.
  • The timeline for AI to effectively manage intricate coding tasks remains uncertain despite advancements in AI capabilities.
  • A detailed technical report is available for in-depth information on the research findings.
  • The article delves into the struggles of AI models in tackling complex coding challenges, emphasizing the need for improvement in reasoning and problem-solving capabilities.
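
The point about success rates falling on longer tasks, and high reliability requiring much shorter ones, can be made concrete with a small sketch. It assumes a constant failure rate per minute of work, so that success decays exponentially with task length. That model, the function names, and the 60-minute figure are illustrative assumptions and are not taken from the article.

```python
import math


def success_probability(task_minutes: float, half_life_minutes: float) -> float:
    """Chance of completing a task, ASSUMING a constant failure rate per minute."""
    return 0.5 ** (task_minutes / half_life_minutes)


def max_task_minutes(target_success: float, half_life_minutes: float) -> float:
    """Longest task that still meets the target success rate under the same assumption."""
    return half_life_minutes * math.log(target_success) / math.log(0.5)


# Hypothetical agent: succeeds half the time on 60-minute tasks.
half_life = 60.0
for target in (0.5, 0.9, 0.99):
    limit = max_task_minutes(target, half_life)
    print(f"{target:.0%} success -> tasks up to {limit:.1f} min")
```

Under this assumption, an agent that succeeds half the time on hour-long tasks reaches 90% reliability only on tasks of roughly nine minutes, and 99% reliability only on tasks shorter than one minute.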
