Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable

TechCrunch
Cybersecurity experts are frustrated that Anthropic's new Fable model uses overly broad guardrails that block even innocuous tasks related to coding or research.

Summary

Anthropic recently launched Fable, a public version of its specialized cybersecurity model, Mythos. Industry experts have criticized the model, arguing that its safety guardrails are excessively restrictive. Researchers report that the AI frequently blocks harmless requests, such as standard code reviews or reading security blogs, because it misidentifies them as potential threats related to malware or biology. While some experts acknowledge that these measures are an early-stage precaution against misuse, they criticize the system for relying on broad keyword triggers that hinder legitimate cybersecurity and software engineering work.

(Source: TechCrunch)