venturebeat
Attention ISN'T all you need?! New Qwen3 variant Brumby-14B-Base leverages Power Retention technique

When the transformer architecture was introduced in 2017 in the now seminal Google paper "Attention Is All You Need," it became an instant cornerstone of modern artificial intelligence. Every major large language model (LLM) — from OpenAI's GPT series to Anthropic's Claude, Google's Gemini, and Meta's Llama — has been built on some variation of its central mechanism: attention, the mathematical operation that allows a model to look back across its entire input and decide what information matters most.Eight years later, the same mechanism that defined AI’s golden age is now showing its limits. Attention is powerful, but it is also expensive — its computational and memory costs scale quadratically with context length, creating an increasingly unsustainab [...]

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat
Nous Research's NousCoder-14B is an open-source coding model landing right in the Claude Code moment

Nous Research, the open-source artificial intelligence startup backed by crypto venture firm Paradigm, released a new competitive programming model on Monday that it says matches or exceeds several la [...]

Match Score: 239.94

venturebeat
Alibaba's Qwen 3.5 397B-A17 beats its larger trillion-parameter model — at a fraction of the cost

Alibaba dropped Qwen3.5 earlier this week, timed to coincide with the Lunar New Year, and the headline numbers alone are enough to make enterprise AI buyers stop and pay attention.The new flagship ope [...]

Match Score: 217.00

venturebeat
Alibaba's small, open source Qwen3.5-9B beats OpenAI's gpt-oss-120B and can run on standard laptops

Despite political turmoil in the U.S. AI sector, in China, the AI advances are continuing apace without a hitch.Earlier today, e-commerce giant Alibaba's Qwen Team of AI researchers, focused prim [...]

Match Score: 178.53

venturebeat
Alibaba's new open source Qwen3.5-Medium models offer Sonnet 4.5 performance on local computers

Alibaba's now famed Qwen AI development team has done it again: a little more than a day ago, they released the Qwen3.5 Medium Model series consisting of four new large language models (LLMs) wit [...]

Match Score: 153.87

venturebeat
Qwen3-Max Thinking beats Gemini 3 Pro and GPT-5.2 on Humanity's Last Exam (with search)

Chinese AI and tech firms continue to impress with their development of cutting-edge, state-of-the-art AI language models.Today, the one drawing eyeballs is Alibaba Cloud's Qwen Team of AI resear [...]

Match Score: 149.86

venturebeat
Qwen3-Coder-Next offers vibe coders a powerful open source, ultra-sparse model with 10x higher throughput for repo tasks

Chinese e-commerce giant Alibaba's Qwen team of AI researchers has emerged in the last year as one of the global leaders of open source AI development, releasing a host of powerful large language [...]

Match Score: 108.74

venturebeat
New KV cache compaction technique cuts LLM memory 50x without accuracy loss

Enterprise AI applications that handle large documents or long-horizon tasks face a severe memory bottleneck. As the context grows longer, so does the KV cache, the area where the model’s working me [...]

Match Score: 97.45

Destination
The best power banks and portable chargers for every device in 2025

On a recent work trip, I had plenty of things to worry about — but being able to recharge my two smartphones, laptop and iPad were not among my concerns. In my carry-on luggage, I had two medium-cap [...]

Match Score: 82.16

venturebeat
Phi-4 proves that a 'data-first' SFT methodology is the new differentiator

AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]

Match Score: 78.14