Skywork-OR1 is a reinforcement learning (RL) implementation designed to enhance the reasoning capabilities of large language models (LLMs).
The RL approach in Skywork-OR1 significantly improves average accuracy across the AIME24, AIME25, and LiveCodeBench benchmarks.
The Skywork-OR1-32B model outperforms DeepSeek-R1 and Qwen3 on the AIME24 and AIME25 benchmarks.
The work includes ablation studies on training components and an analysis of entropy dynamics, and the model weights, training code, and datasets are open-sourced.