Enterprises can look forward to new capabilities — and strategic decisions — around the crucial task of creating a solid foundation for AI expansion in 2025. New chips, accelerators, co-processors, servers and other networking and storage hardware specially designed for AI promise to ease current shortages and deliver higher performance, expand service variety and availability, [...]
The standard guidelines for building large language models (LLMs) optimize only for training costs and ignore inference costs. This poses a challenge for real-world applications that use inference-time [...]
Researchers at the University of Illinois Urbana-Champaign and Google Cloud AI Research have developed a framework that enables large language model (LLM) agents to organize their experiences into a m [...]
AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]
Nvidia on Monday took the wraps off Vera Rubin, a sweeping new computing platform built from seven chips now in full production — and backed by an extraordinary lineup of customers that includes Ant [...]
Two days after releasing what analysts call the most powerful open-source AI model ever created, researchers from China's Moonshot AI logged onto Reddit to face a restless audience. The Beijing-based [...]
In a new paper that studies tool-use in large language model (LLM) agents, researchers at Google and UC Santa Barbara have developed a framework that enables agents to make more efficient use of tool [...]