2025-10-20
Researchers at Mila have proposed a new technique that makes large language models (LLMs) vastly more efficient when performing complex reasoning. Called Markovian Thinking, the approach allows LLMs to engage in lengthy reasoning without incurring the prohibitive computational costs that currently limit such tasks.
The team’s implementation, an environment named Delethink, structures the reasoning chain into fixed-size chunks, breaking the scaling problem that plagues very long LLM responses. Initial estimates show that for a 1.5B parameter model, this method can cut the costs of training by more than two-thirds compared to standard approaches.
For an LLM to solve a co [...]
2025-10-09
Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason. The method, called reinforcement learning pre-training (RLP), integrates [...]
2025-09-29
DeepSeek continues to push the frontier of generative AI...in this case, in terms of affordability.The company has unveiled its latest experimental large language model (LLM), DeepSeek-V3.2-Exp, that [...]
2025-10-02
IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]
2025-10-08
The trend of AI researchers developing new, small open source generative models that outperform far larger, proprietary peers continued this week with yet another staggering advancement.Alexia Jolicoe [...]
2025-10-21
DeepSeek, the Chinese artificial intelligence research company that has repeatedly challenged assumptions about AI development costs, has released a new model that fundamentally reimagines how large l [...]
2025-09-30
Meta’s AI research team has released a new large language model (LLM) for coding that enhances code understanding by learning not only what code looks like, but also what it does when executed. The [...]
2024-11-08
Non-fungible tokens (NFTs) are the most popular digital assets today, capturing the attention of cryptocurrency investors, whales and people from around the world. People find it amazing that some u [...]
2025-10-08
The latest addition to the small model wave for enterprises comes from AI21 Labs, which is betting that bringing models to devices will free up traffic in data centers. AI21’s Jamba Reasoning 3B, a [...]
2025-07-12
Microsoft has introduced Phi-4-mini-flash-reasoning, a lightweight AI model built for scenarios with tight computing, memory, or latency limits. Designed for edge devices and mobile apps, the model ai [...]