Safety

Anthropic publishes its constitution along with research about where the constitution works and where it does not. The current version is an ethical treatise addressing Claude discussing safety, ethics, Anthropic’s guidelines, and helpfulness, in that order when they conflict. Anthropic favours cultivating good values and judgment over strict rules. In 2026, Anthropic’s operational judgment failed twice in the same way, leading to the leak of the Claude Code source code.The constitution asks Claude to imagine how a “thoughtful senior Anthropic employee would react”, but what happens when the organisation’s structure fails? ...