This technical report introduces Ring-Lite-Distill, a lightweight reasoning model derived from the Mixture-of-Experts (MoE) Large Language Model (LLM) Ling-Lite.
Through high-quality data curation and carefully designed training paradigms, the model demonstrates exceptional reasoning capabilities while maintaining a compact, parameter-efficient architecture with only 2.75 billion activated parameters.
The model is designed to achieve comprehensive competency coverage while preserving general capabilities such as instruction following, tool use, and knowledge retention.
Ring-Lite-Distill's reasoning ability is comparable to that of DeepSeek-R1-Distill-Qwen-7B, while its general capabilities are superior.