NVIDIA has released Llama Nemotron Nano 4B, an open-source reasoning model optimized for edge AI and scientific tasks.
With 4 billion parameters, the model delivers higher accuracy and up to 50% greater throughput than models with 8 billion parameters, according to NVIDIA's internal benchmarks.
Nemotron Nano 4B is a compact model built for efficient inference, targeting hybrid reasoning and instruction-following tasks that run outside conventional cloud environments.
The model handles both single-turn and multi-turn reasoning tasks and supports context windows of up to 128,000 tokens, making it suitable for long, complex workloads.
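
To make the 128,000-token context window concrete, here is a minimal sketch of how an application might check whether a multi-turn conversation still fits before sending it to the model. The helper names (`estimate_tokens`, `fits_in_context`, `trim_oldest`), the four-characters-per-token heuristic, and the 1,024-token output reserve are all illustrative assumptions, not part of NVIDIA's tooling; a real deployment would count tokens with the model's own tokenizer.

```python
# Context window size quoted in the article.
CONTEXT_WINDOW_TOKENS = 128_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate.

    Assumption: about 4 characters per token. This is a heuristic,
    not the model's actual tokenizer.
    """
    return int(len(text) / chars_per_token) + 1


def fits_in_context(messages: list[str], reserve_for_output: int = 1024) -> bool:
    """Return True if the messages plus an output reserve fit in the window."""
    used = sum(estimate_tokens(m) for m in messages)
    return used + reserve_for_output <= CONTEXT_WINDOW_TOKENS


def trim_oldest(messages: list[str], reserve_for_output: int = 1024) -> list[str]:
    """Drop the oldest turns until the conversation fits (keeps at least one)."""
    trimmed = list(messages)
    while len(trimmed) > 1 and not fits_in_context(trimmed, reserve_for_output):
        trimmed.pop(0)
    return trimmed


if __name__ == "__main__":
    history = ["a" * 400_000, "b" * 400_000, "What does the second report conclude?"]
    print(fits_in_context(history))
    print(len(trim_oldest(history)))
```

Dropping the oldest turns first is only one possible policy; summarizing earlier turns or retrieving relevant passages are common alternatives when the full history matters.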