Destination
Microsoft introduces Phi-4-mini-flash-reasoning with up to 10x higher token throughput

Microsoft has introduced Phi-4-mini-flash-reasoning, a lightweight AI model built for scenarios with tight computing, memory, or latency limits. Designed for edge devices and mobile apps, the model aims to deliver strong reasoning abilities without demanding hardware.<br /> The article Microsoft introduces Phi-4-mini-flash-reasoning with up to 10x higher token throughput appeared first on THE DECODER. [...]

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat
Microsoft built Phi-4-reasoning-vision-15B to know when to think — and when thinking is a waste of time

Microsoft on Tuesday released Phi-4-reasoning-vision-15B, a compact open-weight multimodal AI model that the company says matches or exceeds the performance of systems many times its size — while co [...]

Match Score: 616.62

venturebeat
Phi-4 proves that a 'data-first' SFT methodology is the new differentiator

AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]

Match Score: 535.20

venturebeat
Google releases Gemini 3.1 Flash Lite at 1/8th the cost of Pro

Google's newest AI model is here: Gemini 3.1 Flash-Lite, and the biggest improvements this time around come in cost and speed, especially for enterprises and developers seeking to leverage powerf [...]

Match Score: 200.77

venturebeat
Gemini 3 Flash arrives with reduced costs and latency — a powerful combo for enterprises

Enterprises can now harness the power of a large language model that's near that of the state-of-the-art Google’s Gemini 3 Pro, but at a fraction of the cost and with increased speed, thanks to [...]

Match Score: 162.13

venturebeat
Google says Gemini 3.5 Flash can slash enterprise AI costs by more than $1 billion a year

Google unveiled Gemini 3.5 Flash at its annual I/O developer conference on Tuesday, a new artificial intelligence model that the company says shatters what had become a seemingly iron law of the AI in [...]

Match Score: 143.18

venturebeat
How DeepSeek’s radical architecture is shattering Silicon Valley's token moat

DeepSeek’s announcement over the weekend that it has made its 75% price cut permanent on its flagship V4 Pro model is a disruptive assault on the capital-heavy business models of Silicon Valley’s [...]

Match Score: 126.09

venturebeat
Microsoft AI chief says company was “set free” from OpenAI to pursue superintelligence

For three years, Microsoft's artificial intelligence story has been inseparable from OpenAI. The partnership — cemented by a cumulative investment exceeding $13 billion — gave Microsoft early [...]

Match Score: 113.47

venturebeat
Researchers baked 3x inference speedups directly into LLM weights — without speculative decoding

As agentic AI workflows multiply the cost and latency of long reasoning chains, a team from the University of Maryland, Lawrence Livermore National Labs, Columbia University and TogetherAI has found a [...]

Match Score: 112.13

Destination
Microsoft expands its SLM lineup with new multimodal and mini Phi-4 models

Microsoft has added two new models to its Phi small language model family: Phi-4-multimodal, which can handle audio, images and text simultaneously, and Phi-4-mini, a streamlined model focused on text [...]

Match Score: 108.33