Arxiv

1w

R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents

  • Improving open-source models on real-world SWE tasks (solving GitHub issues) faces two challenges: scalable curation of execution environments and optimal scaling of test-time compute.
  • The paper introduces AgentGym, described as the largest procedurally curated executable gym environment for training real-world SWE agents, with more than 8.7K tasks.
  • SYNGEN, a synthetic data curation recipe, makes the curation of executable environments scalable and leads to improved training performance.
  • Hybrid Test-time Scaling combines execution-based and execution-free verifiers, whose strengths and limitations are complementary.
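
To make the idea of combining the two verifier types concrete, here is a minimal sketch of one way candidate patches could be ranked. It is an illustration under stated assumptions, not the paper's method: the `Candidate` fields, the `hybrid_score` rule (test pass rate first, model score as tie-breaker), and `select_best` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    """One candidate patch for an issue (hypothetical structure)."""
    patch_id: str
    tests_passed: int    # execution-based signal: tests that passed
    tests_total: int     # execution-based signal: tests that were run
    model_score: float   # execution-free signal in [0, 1], e.g. from a learned verifier


def hybrid_score(c: Candidate) -> Tuple[float, float]:
    """Assumed combination rule: compare by pass rate, then by model score."""
    pass_rate = c.tests_passed / c.tests_total if c.tests_total else 0.0
    return (pass_rate, c.model_score)


def select_best(candidates: List[Candidate]) -> Candidate:
    """Return the candidate with the highest hybrid score."""
    if not candidates:
        raise ValueError("no candidates to rank")
    return max(candidates, key=hybrid_score)


if __name__ == "__main__":
    pool = [
        Candidate("a", tests_passed=3, tests_total=4, model_score=0.2),
        Candidate("b", tests_passed=3, tests_total=4, model_score=0.9),
        Candidate("c", tests_passed=1, tests_total=4, model_score=0.99),
    ]
    print(select_best(pool).patch_id)
```

In this sketch the execution-based signal separates patches that behave differently under test, while the execution-free score breaks ties between patches the tests cannot tell apart. Other combination rules, such as a weighted sum, would fit the same interface.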
