The latest addition to the small model wave for enterprises comes from AI21 Labs, which is betting that bringing models to devices will free up traffic in data centers. AI21’s Jamba Reasoning 3B, a “tiny” open-source model that can run extended reasoning, code generation and respond based on ground truth. Jamba Reasoning 3B handles more than 250,000 tokens and can run inference on edge devices. The company said Jamba Reasoning 3B works on devices such as laptops and mobile phones. Ori Goshen, co-CEO of AI21, told VentureBeat that the company sees more enterprise use cases for small models, mainly because moving most inference to devices frees up data centers. “What we're seeing right now in the industry is an economics issue where there are very expensive data center bui [...]
Microsoft on Tuesday released Phi-4-reasoning-vision-15B, a compact open-weight multimodal AI model that the company says matches or exceeds the performance of systems many times its size — while co [...]
For the last two years, the prevailing logic in generative AI has been one of brute force: if you want better reasoning, you need a bigger model. While "small" models (under 10 billion param [...]
IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]
AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]
A new framework from Stanford University and SambaNova addresses a critical challenge in building robust AI agents: context engineering. Called Agentic Context Engineering (ACE), the framework automat [...]
Nvidia launched the new version of its frontier models, Nemotron 3, by leaning in on a model architecture that the world’s most valuable company said offers more accuracy and reliability for agents. [...]
Deploying AI agents for repository-scale tasks like bug detection, patch verification, and code review requires overcoming significant technical hurdles. One major bottleneck: the need to set up dynam [...]
Researchers at MiroMind AI and several Chinese universities have released OpenMMReasoner, a new training framework that improves the capabilities of language models in multimodal reasoning.The framewo [...]
For all their superhuman power, today’s AI models suffer from a surprisingly human flaw: They forget. Give an AI assistant a sprawling conversation, a multi-step reasoning task or a project spanning [...]