<ul><li>Microsoft is expanding its Phi line of language models with two new models optimized for multimodal processing and hardware efficiency.</li><li>The first addition, Phi-4-mini, is a text-only model with 3.8 billion parameters. It is based on a decoder-only transformer architecture, a design that reduces hardware usage and improves processing speed.</li><li>The second model, Phi-4-multimodal, has 5.6 billion parameters and can process text, images, audio, and video, outperforming other multimodal models in benchmark tests.</li><li>Both Phi-4-mini and Phi-4-multimodal will be made available on Hugging Face under an MIT license.</li></ul>

Microsoft releases new Phi models optimized for multimodal processing, efficiency
