<ul><li>Nvidia has launched Parakeet-TDT-0.6B-v2, an open-source automatic speech recognition (ASR) model that can transcribe 60 minutes of audio in 1 second.</li><li>Parakeet-TDT-0.6B-v2 has 600 million parameters and achieves a Word Error Rate (WER) of 6.05%, competitive with top proprietary transcription models.</li><li>The model is freely available under a Creative Commons CC-BY-4.0 license and can be used in transcription services, voice assistants, and conversational AI platforms.</li><li>Trained on the Granary dataset, the model generalizes well, supports punctuation and capitalization, and can be deployed using Nvidia’s NeMo toolkit.</li></ul>
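
For readers unfamiliar with the metric, Word Error Rate is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not Nvidia's evaluation code; the function name `word_error_rate` and the lowercase, whitespace-split normalization are assumptions made for the example, and real benchmarks may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words.

    Assumption for this sketch: both strings are lowercased and split on
    whitespace; punctuation is left attached to words.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A WER of 6.05% therefore corresponds to roughly six word-level errors per hundred reference words.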

Nvidia launches fully open-source transcription AI model Parakeet-TDT-0.6B-v2 on Hugging Face
