The latest big headline in AI isn’t model size or multimodality — it’s the capacity crunch. At VentureBeat’s latest AI Impact stop in NYC, Val Bercovici, chief AI officer at WEKA, joined Matt Marshall, VentureBeat CEO, to discuss what it really takes to scale AI amid rising latency, cloud lock-in, and runaway costs.Those forces, Bercovici argued, are pushing AI toward its own version of surge pricing. Uber famously introduced surge pricing, bringing real-time market rates to ridesharing for the first time. Now, Bercovici argued, AI is headed toward the same economic reckoning — especially for inference — when the focus turns to profitability."We don't have real market rates today. We have subsidized rates. That’s been necessary to enable a lot of the innovation that [...]
For the past year, enterprise decision-makers have faced a rigid architectural trade-off in voice AI: adopt a "Native" speech-to-speech (S2S) model for speed and emotional fidelity, or stick [...]
On a recent work trip, I had plenty of things to worry about — but being able to recharge my two smartphones, laptop and iPad were not among my concerns. In my carry-on luggage, I had two medium-cap [...]
The AI updates aren't slowing down. Literally two days after OpenAI launched a new underlying AI model for ChatGPT called GPT-5.3 Instant, the company has unveiled another, even more massive upgr [...]
Enterprises can now harness the power of a large language model that's near that of the state-of-the-art Google’s Gemini 3 Pro, but at a fraction of the cost and with increased speed, thanks to [...]
Chinese AI startup Z.ai, known for its powerful, open source GLM family of large language models (LLMs), has introduced GLM-5-Turbo, a new, proprietary variant of its open source GLM-5 model aimed at [...]