Tencent has released an open-source large language model called Hunyuan-A13B, built on a sparse Mixture-of-Experts (MoE) architecture with 80 billion total parameters, of which 13 billion are active per token.
Hunyuan-A13B features Grouped Query Attention (GQA), a 256K context length, and a dual-mode reasoning system that switches between fast and slow thinking to balance computational cost against output quality.
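The fast/slow mode is typically selected per request through the chat template. Below is a minimal sketch using Hugging Face Transformers; the repository id `tencent/Hunyuan-A13B-Instruct` and the `enable_thinking` template flag are assumptions based on how dual-mode checkpoints commonly expose the switch, not a confirmed API.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Hunyuan-A13B-Instruct"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain grouped query attention in one paragraph."}]

# Toggle between slow thinking (chain-of-thought) and fast thinking;
# the enable_thinking kwarg is assumed to be honored by the chat template.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=False,  # fast mode: skip the explicit reasoning phase
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Setting the flag to `True` (or omitting it) would leave the slower, reasoning-heavy mode enabled for harder queries.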
The model achieves top-tier results on agentic benchmarks, excelling at tool calling and long-context scenarios and surpassing some larger models on logical-reasoning and agent tasks.
Optimized for efficient deployment, Hunyuan-A13B ships with quantized variants and is supported by popular inference frameworks such as vLLM, SGLang, and TensorRT-LLM, making it suitable for real-time applications and long-context workloads.
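As a sketch of what deployment through one of these frameworks might look like, the snippet below runs offline inference with vLLM; the model id, parallelism, and context settings are illustrative assumptions rather than recommended values.

```python
from vllm import LLM, SamplingParams

# Illustrative configuration: adjust tensor_parallel_size to the available GPUs
# and raise max_model_len toward the 256K limit if memory allows.
llm = LLM(
    model="tencent/Hunyuan-A13B-Instruct",  # assumed repo id
    trust_remote_code=True,
    tensor_parallel_size=2,
    max_model_len=32768,
)

params = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=512)
outputs = llm.generate(["Summarize the trade-offs of sparse MoE models."], params)
print(outputs[0].outputs[0].text)
```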