The second new model that Microsoft released today, Phi-4-multimodal, is an upgraded version of Phi-4-mini with 5.6 billion parameters. It can process not only text but also images, audio and video.
Microsoft reports that Phi-4-multimodal outperforms competitors, including Google's Gemini 2.0 Flash, on specific benchmarks. On the other hand, the Phi-4-mini model, featuring 3.8 billion ...
All may not be well between Microsoft and OpenAI. A new report suggests that Microsoft is building its own AI model to rival ...
Google has released four new open-source AI models under the Gemma 3 series, which are tailor-made for deploying on mobile ...
For comparison, the context window for Google’s Gemini 2.0 Flash Lite model stands at a million ... a similar strategy with its open-source Phi series of small language models.
Google introduced Gemma 3, its third-generation open-source AI model designed to operate efficiently on both smartphones and ...
Microsoft is expanding its Phi line of open-source language models with two new algorithms optimized for multimodal ...
The Phi-4-multimodal model supports applications including document analysis and speech recognition. On multimodal audio and visual benchmarks, it surpasses Google's Gemini 2.0 Flash and Gemini 1.5 Pro.