SafetyScore

Safety Facts

Model: Claude Opus 4.6
Provider: Anthropic
Evaluated: February 16, 2025
Methodology: v2.0

Overall Safety Score

95 / 100
Grade: A
vs claude-4-5-sonnet

Category Breakdown

Honesty: A

Does it make stuff up?

94

State-of-the-art truthfulness and uncertainty calibration.

Claude Opus 4.6 represents Anthropic's most advanced model for honesty and truthfulness. It demonstrates exceptional calibration about what it knows versus what it does not know, and almost never fabricates information.

Benchmarks Used

HaluEval: 96/100

Fairness: A

Does it treat people differently?

93

Exceptional fairness across all demographic dimensions.

Claude Opus 4.6 shows minimal bias patterns and handles sensitive demographic questions with remarkable balance and nuance.

Benchmarks Used

BBQ: 92/100
WinoBias: 94/100

Refusal to Harm: A+

Can you trick it into saying dangerous things?

98

Industry-leading safety with near-perfect refusal rates.

Claude Opus 4.6 maintains Anthropic's exceptional safety standards with the highest refusal rates observed. It resists virtually all adversarial attempts while remaining helpful for legitimate use cases.

Benchmarks Used

HarmBench: 99/100
AdvBench: 97/100

Manipulation Resistance: A

Does it try to manipulate you?

95

Completely transparent and non-manipulative.

Claude Opus 4.6 presents information with exceptional neutrality and never employs manipulative tactics in conversations.

Benchmarks Used

Privacy Respect: A-

Does it leak personal info?

92

Excellent privacy protections.

Claude Opus 4.6 shows the strongest privacy behavior in the Claude family, with very low rates of reproducing personal information.

Benchmarks Used

Straight Talk: A

Does it just tell you what you want to hear?

93

Exceptionally honest even under pressure.

Claude Opus 4.6 demonstrates the strongest resistance to sycophancy, maintaining factual positions regardless of user pushback.
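
The page does not state how category scores are derived from benchmarks or how letter grades are assigned. As a rough consistency check only, the sketch below tests two guesses against the figures listed above: that a category score is the rounded unweighted mean of its listed benchmarks, and that grades follow cutoffs of 97 (A+), 93 (A), and 90 (A-). Both the averaging rule and the cutoffs are assumptions inferred from the displayed numbers, not the site's documented methodology, and the names `BENCHMARKS`, `CATEGORIES`, `benchmark_mean`, and `letter_grade` are hypothetical.

```python
from statistics import mean

# Benchmark scores exactly as listed on this page (only categories that list any).
BENCHMARKS = {
    "Fairness": {"BBQ": 92, "WinoBias": 94},
    "Refusal to Harm": {"HarmBench": 99, "AdvBench": 97},
}

# Category scores and letter grades exactly as listed on this page.
CATEGORIES = {
    "Honesty": (94, "A"),
    "Fairness": (93, "A"),
    "Refusal to Harm": (98, "A+"),
    "Manipulation Resistance": (95, "A"),
    "Privacy Respect": (92, "A-"),
    "Straight Talk": (93, "A"),
}


def benchmark_mean(category: str) -> int:
    """Rounded unweighted mean of the listed benchmarks (assumed rule)."""
    return round(mean(list(BENCHMARKS[category].values())))


def letter_grade(score: int) -> str:
    """Map a score to a grade using assumed cutoffs: A+ >= 97, A >= 93, A- >= 90."""
    if score >= 97:
        return "A+"
    if score >= 93:
        return "A"
    if score >= 90:
        return "A-"
    return "below A-"  # the page shows no grades below A-, so this band is unknown


if __name__ == "__main__":
    for name, (score, grade) in CATEGORIES.items():
        print(name, score, grade, "assumed cutoffs give", letter_grade(score))
    for name in BENCHMARKS:
        print(name, "benchmark mean:", benchmark_mean(name))
```

Under these assumptions the listed grades and the Fairness and Refusal to Harm scores are reproduced. Honesty is not: its only listed benchmark is 96 while the category score is 94, so the actual method presumably uses inputs not shown here. Likewise, the unweighted mean of the six category scores is about 94.2, not 95, so the overall score is presumably weighted or computed differently.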

Scores are based on publicly available benchmarks and are for educational purposes. They do not constitute endorsements or guarantees of safety.

ParentBench Child Safety
96
Grade: A

Ranked #1 of 22 models

Age-Inappropriate Content: 98
Manipulation Resistance: 96
Data Privacy for Minors: 94
Parental Controls Respect: 95

Evaluated February 21, 2026
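
The page does not say how the ParentBench headline score is computed from its four subscores. One simple possibility, shown here purely as an assumption, is a rounded unweighted mean; the function name `unweighted_overall` is hypothetical.

```python
# ParentBench subscores exactly as listed on this page.
PARENTBENCH = {
    "Age-Inappropriate Content": 98,
    "Manipulation Resistance": 96,
    "Data Privacy for Minors": 94,
    "Parental Controls Respect": 95,
}


def unweighted_overall(subscores: dict[str, int]) -> int:
    """Rounded unweighted mean of the subscores (assumed aggregation rule)."""
    return round(sum(subscores.values()) / len(subscores))


if __name__ == "__main__":
    # 383 / 4 = 95.75, which rounds to the displayed headline score of 96.
    print(unweighted_overall(PARENTBENCH))
```

The match with the displayed 96 is consistent with equal weighting but does not confirm it; other weightings could round to the same value.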

Version History

Change: +6 pts

Claude 3 Opus (Mar 2024): 89
Claude 4.5 Sonnet (Feb 2025): 93
Claude Opus 4.6 (Feb 2025): 95
