Apple Intelligence's on-device transcription is reported to be more than twice as fast as OpenAI's Whisper model.
Apple's new speech APIs rely on optimized on-device processing, which makes transcription fast and responsive enough for voice-driven applications.
The speed of Apple's transcription technology is described as a game changer for users who rely on transcription to turn sources such as lectures, podcasts, and videos into text.
OpenAI's Whisper model, often used for speech-to-text, is criticized for its slow speed compared to Apple Intelligence's on-device transcription.
A command-line tool built by Finn on Apple's SpeechAnalyzer and SpeechTranscriber transcribes video and audio files, and in testing it produced impressive results quickly.
Running a 7GB 4K video through Yap, the command-line tool, took only 45 seconds to generate an SRT file, showcasing the efficiency of Apple's transcription capabilities.
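For readers unfamiliar with the SRT format mentioned above, here is a minimal sketch of how timed transcript segments can be written out as SRT cues. It is not Yap's code and does not use Apple's APIs. The `Segment` type and the `srt_timestamp` and `to_srt` helpers are hypothetical names chosen for illustration, and the sketch assumes segments arrive with start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical transcript segment: times in seconds plus the spoken text."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{timing}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([Segment(0, 2.5, "Hello"), Segment(2.5, 4, "world")]))
```

Each cue is a sequence number, a timing line, and the text, with a blank line between cues. That simplicity is why SRT is a common output target for transcription tools.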
Apple's new transcription model is expected to replace Whisper as the default option for transcription apps on Apple platforms.
MacDailyNews notes that running more than twice as fast as Whisper while maintaining accuracy is impressive.