Xiaomi Corp. has released MiMo-7B, a new series of reasoning models with 7 billion parameters that outperforms OpenAI's o1-mini on some tasks. Alongside the base model, Xiaomi developed enhanced versions using supervised fine-tuning and reinforcement learning.
DeepSeek has also updated its Prover reasoning model to Prover-V2, which is optimized for proving mathematical theorems. DeepSeek trained Prover-V2 using a multistep process involving existing proofs.
Xiaomi's MiMo-7B series includes a base model and three enhanced versions: one fine-tuned with supervised learning, one trained with reinforcement learning, and one that uses both methods. The series surpasses OpenAI's o1-mini on some tasks.
Alibaba recently introduced Qwen3, a family of models that it claims outperforms OpenAI and DeepSeek models. Together, these releases reflect the competitive landscape for reasoning-optimized AI models.