zdnet

2025-10-07

Anthropic's open-source safety tool found AI models whistleblowing - in all the wrong places

The Petri tool found AI "may be influenced by narrative patterns more than by a coherent drive to minimize harm." Here's how the most deceptive models ranked. [...]

venturebeat

2025-10-02

'Western Qwen': IBM wows with Granite 4 LLM launch and hybrid Mamba/Transformer architecture

IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]

Match Score: 99.28

2025-02-10

Roblox, Discord, OpenAI and Google found new child safety group

Roblox, Discord, OpenAI and Google are launching a nonprofit organization called ROOST, or Robust Open Online Safety Tools, which hopes "to build scalable, interoperable safety infrastructure su [...]

Match Score: 84.89

2025-08-05

OpenAI's first new open-weight LLMs in six years are here

For the first time since GPT-2 in 2019, OpenAI is releasing new open-weight large language models. It's a major milestone for a company that has increasingly been accused of forgoing its original [...]

Match Score: 84.33

2025-07-30

Is Mark Zuckerberg flip flopping on open source AI?

Earlier today, Mark Zuckerberg shared a rambling memo outlining his vision to build AI "superintelligence." In the memo, Zuckerberg hinted that the pursuit of more powerful AI might require [...]

Match Score: 80.57

2025-08-27

OpenAI and Anthropic conducted safety evaluations of each other's AI systems

Most of the time, AI companies are locked in a race to the top, treating each other as rivals and competitors. Today, OpenAI and Anthropic revealed that they agreed to evaluate the alignment of each o [...]

Match Score: 78.21

2025-09-29

Claude Sonnet 4.5 is Anthropic's safest AI model yet

In May, Anthropic announced two new AI systems, Opus 4 and Sonnet 4. Now, less than six months later, the company is introducing Sonnet 4.5, and calling it the best coding model in the world to date. [...]

Match Score: 66.69

2025-10-07

Anthropic launches Petri, an open-source tool for automated AI model safety audits

Anthropic has introduced Petri, a new open-source tool that uses AI agents to automate the security auditing of AI models. In initial tests with 14 leading models, Petri uncovered problematic behavior [...]

Match Score: 58.21

2025-09-09

Microsoft reportedly plans to start using Anthropic models to power some of Office 365's Copilot features

Microsoft reportedly plans to begin using Anthropic's latest Claude models to power some of the Copilot features in its Office 365 apps. In a report published Tuesday, The Information said the te [...]

Match Score: 56.55

2025-01-22

Google is investing another billion dollars in Anthropic

Google has decided to invest another billion into Anthropic, four sources told the Financial Times, bringing its total sunk cost to more than three billion dollars. Both companies have declined to com [...]

Match Score: 55.99