2025-08-21
Deepseek is releasing Deepseek-V3.1, its first hybrid AI model with two operating modes. The company calls the new model its "first step toward the agent era," signaling a focus on building models with stronger agent skills.
The article "Deepseek's first hybrid model V3.1 outperforms its R1 reasoning model on benchmarks" appeared first on THE DECOD [...]
2025-10-21
DeepSeek, the Chinese artificial intelligence research company that has repeatedly challenged assumptions about AI development costs, has released a new model that fundamentally reimagines how large l [...]
2025-09-29
DeepSeek continues to push the frontier of generative AI...in this case, in terms of affordability. The company has unveiled its latest experimental large language model (LLM), DeepSeek-V3.2-Exp, that [...]
2025-10-02
IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]
2025-10-08
The trend of AI researchers developing new, small open source generative models that outperform far larger, proprietary peers continued this week with yet another staggering advancement. Alexia Jolicoe [...]
2025-02-06
Two US Congress members plan to introduce bipartisan legislation to ban China’s DeepSeek AI chatbot from government devices. The bill’s announcement came after a security expert said DeepSeek not [...]
2025-01-27
Chinese AI assistant DeepSeek has become the top rated free app on Apple's App Store in the US and elsewhere, beating out ChatGPT and other rivals. It's powered by the open-source DeepSeek V [...]
2025-10-09
Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason. The method, called reinforcement learning pre-training (RLP), integrates [...]
2025-10-20
Researchers at Mila have proposed a new technique that makes large language models (LLMs) vastly more efficient when performing complex reasoning. Called Markovian Thinking, the approach allows LLMs t [...]
2025-10-08
The latest addition to the small model wave for enterprises comes from AI21 Labs, which is betting that bringing models to devices will free up traffic in data centers. AI21’s Jamba Reasoning 3B, a [...]