When an AI agent loses context mid-task because traditional storage can't keep pace with inference, it is not a model problem — it is a storage problem. At GTC 2026, Nvidia announced BlueField-4 STX, a modular reference architecture that inserts a dedicated context memory layer between GPUs and traditional storage, claiming 5x the token throughput, 4x the energy efficiency and 2x the data ingestion speed of conventional CPU-based storage.

The bottleneck STX targets is key-value cache data. The KV cache is the stored record of what a model has already processed — the intermediate calculations an LLM saves so it does not have to recompute attention across the entire context on every inference step. It is what allows an agent to maintain coherent working memory across sessions, tool calls [...]
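To make the mechanism concrete, here is a minimal NumPy sketch of KV caching in general — not Nvidia's implementation, and every name and dimension in it is illustrative. It shows the core trade the cache makes: each decode step appends one key/value row instead of recomputing keys and values for every earlier token.

```python
import numpy as np

# Illustrative sketch of KV caching (not Nvidia's implementation).
# Without a cache, step t recomputes keys/values for all t prior tokens:
# O(t) extra work per step. With a cache, each step appends one row and
# reuses everything already computed.

d = 64                      # head dimension (arbitrary for this sketch)
rng = np.random.default_rng(0)

K_cache = np.empty((0, d))  # keys of already-processed tokens
V_cache = np.empty((0, d))  # values of already-processed tokens

def attend(q, K, V):
    """Scaled dot-product attention for one query vector."""
    scores = K @ q / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

for step in range(8):       # stand-in for a decode loop
    q, k, v = rng.standard_normal((3, d))
    # Append this token's key/value once; earlier entries are never recomputed.
    K_cache = np.vstack([K_cache, k])
    V_cache = np.vstack([V_cache, v])
    out = attend(q, K_cache, V_cache)
```

The catch, and the problem STX is built around, is that those cached keys and values grow linearly with context length — so at agentic scale the cache has to live somewhere faster than conventional CPU-based storage can serve it.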