CoRT: Code-integrated Reasoning within Thinking

  • Large Reasoning Models (LRMs) like o1 and DeepSeek-R1 have made progress in natural language reasoning with long chain-of-thought (CoT) but struggle with complex mathematical operations.
  • Incorporating a Code Interpreter (CI) brings capabilities beyond the model's internal knowledge, but directly combining the two poses challenges (a minimal interaction loop is sketched after this list).
  • CoRT is a post-training framework designed to teach LRMs to effectively use CI for complex mathematical operations.
  • To address data scarcity, code-integrated reasoning data is synthesized through Hint-Engineering, which strategically inserts hints at appropriate positions to optimize LRM-CI interaction (see the hint-injection sketch after this list).
  • 30 high-quality samples are manually created to post-train models ranging from 1.5B to 32B parameters using supervised fine-tuning, rejection fine-tuning, and reinforcement learning.
  • Hint-Engineering models show 4% and 8% absolute improvements on DeepSeek-R1-Distill-Qwen-32B and DeepSeek-R1-Distill-Qwen-1.5B respectively across five challenging mathematical reasoning datasets.
  • Compared with models that reason purely in natural language, the Hint-Engineering models use about 30% fewer tokens for the 32B model and 50% fewer tokens for the 1.5B model.
  • Experimental results demonstrate the effectiveness of CoRT in improving LRMs' performance on mathematical reasoning tasks.
  • The models and code for CoRT are available at https://github.com/ChengpengLi1003/CoRT.
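
The core mechanic described above is code-integrated reasoning within thinking: the model interleaves natural-language reasoning with executable code, and generation resumes after the interpreter's output is fed back into the context. The sketch below shows one plausible interaction loop; the ```python / ```output delimiters, the `generate` callback, and the unsandboxed `exec` call are illustrative assumptions rather than CoRT's actual implementation.

```python
# Minimal sketch of code-integrated reasoning: the reasoning model writes a
# ```python ... ``` block whenever exact computation is needed, the block is
# executed by a code interpreter, and its printed output is appended to the
# context before generation resumes. Delimiters and the `generate` callback
# are illustrative assumptions, not the exact CoRT implementation.
import contextlib
import io
import re
from typing import Callable

CODE_BLOCK = re.compile(r"```python\n(.*?)```", re.DOTALL)

def run_code(code: str) -> str:
    """Execute a code snippet and capture everything it prints."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # a real system would sandbox this call
    except Exception as exc:  # surface errors so the model can self-correct
        buffer.write(f"Error: {exc}")
    return buffer.getvalue().strip()

def reason_with_interpreter(problem: str,
                            generate: Callable[[str], str],
                            max_rounds: int = 8) -> str:
    """Alternate between model generation and code execution."""
    context = problem
    for _ in range(max_rounds):
        step = generate(context)          # model continues the reasoning
        context += step
        match = CODE_BLOCK.search(step)
        if match is None:                 # no code emitted: reasoning is done
            break
        output = run_code(match.group(1))
        context += f"\n```output\n{output}\n```\n"  # feed results back
    return context

# Toy stand-in for an LRM: first turn emits code, second turn concludes.
def fake_model(context: str) -> str:
    if "```output" not in context:
        return ("Let me compute this exactly.\n"
                "```python\nprint(sum(range(1, 101)))\n```")
    return "\nThe interpreter returns 5050, so the answer is 5050."

print(reason_with_interpreter("What is 1 + 2 + ... + 100?", fake_model))
```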
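Hint-Engineering, mentioned in the list above, synthesizes training data by inserting short hints into a reasoning trace at the points where code would (or would not) help, and only trajectories that reach the reference answer are kept for supervised and rejection fine-tuning. The following sketch is a hedged illustration under assumed hint wording and answer format, not the paper's actual pipeline.

```python
# Hedged sketch of Hint-Engineering data synthesis: hints steer when the model
# writes code, and only trajectories whose final answer matches the reference
# are kept for fine-tuning. The hint wording, the "Final answer:" format, and
# the filtering rule are illustrative assumptions.
import re

HINT_USE_CODE = ("\nThis calculation is getting error-prone; "
                 "let me verify it with Python code.\n")
HINT_SKIP_CODE = "\nThis step is simple enough to finish without code.\n"

def inject_hint(reasoning_prefix: str, use_code: bool) -> str:
    """Append a steering hint at the point where the trace was truncated."""
    hint = HINT_USE_CODE if use_code else HINT_SKIP_CODE
    return reasoning_prefix.rstrip() + hint

def keep_sample(trace: str, reference_answer: str) -> bool:
    """Rejection-style filter: keep a trajectory only if it ends with the
    reference answer on a 'Final answer: ...' line (illustrative format)."""
    match = re.search(r"Final answer:\s*(.+)\s*$", trace)
    return bool(match) and match.group(1).strip() == reference_answer

# Example: the hinted prefix would be completed by the interpreter loop above,
# and the finished trace filtered before entering the SFT / RFT training set.
prefix = "We need the sum of the first 100 squares. Expanding term by term..."
prompt = inject_hint(prefix, use_code=True)
trace = prompt + "\n...\nFinal answer: 338350"
print(keep_sample(trace, "338350"))  # True
```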
