2025-05-13

OpenAI says its latest models outperform doctors in medical benchmark


OpenAI has released a new benchmark for testing AI systems in healthcare. Called HealthBench, it's designed to evaluate how well language models handle realistic medical conversations. According to OpenAI, its latest models outperform doctors on the test.


The article "OpenAI says its latest models outperform doctors in medical benchmark" appeared first on THE DECODER.

We found items similar to what you are looking for. Check out our suggestions below.

venturebeat

2025-12-01

OpenAGI emerges from stealth with an AI agent that it claims crushes OpenAI and Anthropic

A stealth artificial intelligence startup founded by an MIT researcher emerged this morning with an ambitious claim: its new AI model can control computers better than systems built by OpenAI and Anth [...]

Match Score: 79.55

venturebeat

2025-12-10

The 70% factuality ceiling: why Google’s new ‘FACTS’ benchmark is a wake-up call for enterprise AI

There's no shortage of generative AI benchmarks designed to measure the performance and accuracy of a given model on completing various helpful enterprise tasks — from coding to instruction fol [...]

Match Score: 74.53

venturebeat

2025-12-11

OpenAI's GPT-5.2 is here: what enterprises need to know

The rumors were true, and the "Code Red" is over: OpenAI today announced the release of its new frontier large language model (LLM) family: GPT-5.2. It comes at a pivotal moment for the AI pi [...]

Match Score: 68.35

venturebeat

2025-11-20

Grok 4.1 Fast's compelling dev access and Agent Tools API overshadowed by Musk glazing

Elon Musk's frontier generative AI startup xAI formally opened developer access to its Grok 4.1 Fast models last night and introduced a new Agent Tools API—but the technical milestones were imm [...]

Match Score: 64.95

venturebeat

2025-12-02

Mistral launches Mistral 3, a family of open models designed to run on laptops, drones, and edge devices

Mistral AI, Europe's most prominent artificial intelligence startup, is releasing its most ambitious product suite to date: a family of 10 open-source models designed to run everywhere from smart [...]

Match Score: 58.41

blogspot

2025-12-04

How I Get Free Traffic from ChatGPT in 2025 (AIO vs SEO)

Three weeks ago, I tested something that completely changed how I think about organic traffic. I opened ChatGPT and asked a simple question: "What's the best course on building SaaS with Wor [...]

Match Score: 58.32

venturebeat

2025-11-14

OpenAI experiment finds that sparse models could give AI builders the tools to debug neural networks

OpenAI researchers are experimenting with a new approach to designing neural networks, with the aim of making AI models easier to understand, debug, and govern. Sparse models can provide enterprises w [...]

Match Score: 57.02

venturebeat

2025-10-28

IBM's open source Granite 4.0 Nano AI models are small enough to run locally directly in your browser

In an industry where model size is often seen as a proxy for intelligence, IBM is charting a different course — one that values efficiency over enormity, and accessibility over abstraction. The 114-y [...]

Match Score: 56.45

venturebeat

2025-11-06

Moonshot's Kimi K2 Thinking emerges as leading open source AI, outperforming GPT-5, Claude Sonnet 4.5 on key benchmarks

Even as concern and skepticism grow over U.S. AI startup OpenAI's buildout strategy and high spending commitments, Chinese open source AI providers are escalating their competition and one has e [...]

Match Score: 55.98