
Marktechpost


AWS Introduces SWE-PolyBench: A New Open-Source Multilingual Benchmark for Evaluating AI Coding Agents

  • AWS AI Labs has introduced SWE-PolyBench, a multilingual, repository-level benchmark for evaluating AI coding agents.
  • SWE-PolyBench consists of 2,110 tasks across four programming languages: Java, JavaScript, TypeScript, and Python.
  • The benchmark incorporates real pull requests (PRs) and introduces Concrete Syntax Tree (CST)-based metrics for assessment.
  • Agents evaluated on SWE-PolyBench perform unevenly across languages and task types.
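
To give a rough sense of what a syntax-tree-based metric can measure, the sketch below scores how well an agent's patch touches the same code nodes as a reference patch. It is an illustration only, not the benchmark's actual implementation: it uses Python's built-in `ast` module (an abstract syntax tree, swapped in for a concrete syntax tree), considers only top-level functions and classes, and the names `changed_nodes`, `retrieval_scores`, `BASE`, `GOLD`, and `AGENT` are hypothetical.

```python
import ast


def changed_nodes(before: str, after: str) -> set[str]:
    """Names of top-level functions/classes whose syntax tree differs between two versions."""

    def index(source: str) -> dict[str, str]:
        tree = ast.parse(source)
        # ast.dump omits line/column attributes by default, so a node that
        # merely moved within the file still compares equal.
        return {
            node.name: ast.dump(node)
            for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        }

    old, new = index(before), index(after)
    return {name for name in old.keys() | new.keys() if old.get(name) != new.get(name)}


def retrieval_scores(gold: set[str], predicted: set[str]) -> tuple[float, float]:
    """Precision and recall of the nodes an agent modified against the reference set."""
    hits = len(gold & predicted)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy inputs: the original file, the reference (gold) patch result,
# and the agent's patch result.
BASE = "def a():\n    return 1\n\ndef b():\n    return 2\n\ndef c():\n    return 3\n"
GOLD = "def a():\n    return 10\n\ndef b():\n    return 20\n\ndef c():\n    return 3\n"
AGENT = "def a():\n    return 1\n\ndef b():\n    return 20\n\ndef c():\n    return 30\n"

gold_nodes = changed_nodes(BASE, GOLD)
agent_nodes = changed_nodes(BASE, AGENT)
precision, recall = retrieval_scores(gold_nodes, agent_nodes)
```

A metric of this kind complements a plain pass/fail test result: it indicates whether an agent located the right parts of the code even when its fix does not pass the tests.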
