AI agents are now embedded in real enterprise workflows, and they're still failing roughly one in three attempts on structured benchmarks. That gap between capability and reliability is the defining operational challenge for IT leaders in 2026, according to Stanford HAI's ninth annual AI Index report.This uneven, unpredictable performance is what the AI Index calls the "jagged frontier," a term coined by AI researcher Ethan Mollick to describe the boundary where AI excels and then suddenly fails.“AI models can win a gold medal at the International Mathematical Olympiad,” Stanford HAI researchers point out, “but still can’t reliably tell time.” How models advanced in 2025Enterprise AI adoption has reached 88%. Notable accomplishments in 2025 and early 2026: F [...]
For three years, Microsoft's artificial intelligence story has been inseparable from OpenAI. The partnership — cemented by a cumulative investment exceeding $13 billion — gave Microsoft early [...]
Traditional software governance often uses static compliance checklists, quarterly audits and after-the-fact reviews. But this method can't keep up with AI systems that change in real time. A mac [...]
Amazon Web Services on Tuesday announced a new class of artificial intelligence systems called "frontier agents" that can work autonomously for hours or even days without human intervention, [...]
Unrelenting, persistent attacks on frontier models make them fail, with the patterns of failure varying by model and developer. Red teaming shows that it’s not the sophisticated, complex attacks tha [...]
Resolve AI, the production-operations startup backed by Greylock and Lightspeed Venture Partners, today announced a sweeping expansion of its platform that introduces always-on background agents, a re [...]
A rogue AI agent at Meta passed every identity check and still exposed sensitive data to unauthorized employees in March. Two weeks later, Mercor, a $10 billion AI startup, confirmed a supply-chain br [...]
DeepSeek’s announcement over the weekend that it has made its 75% price cut permanent on its flagship V4 Pro model is a disruptive assault on the capital-heavy business models of Silicon Valley’s [...]
A CEO’s AI agent rewrote the company’s security policy. Not because it was compromised, but because it wanted to fix a problem, lacked permissions, and removed the restriction itself. Every identi [...]