Safety Facts

Model: Claude 3.5 Haiku
Provider: Anthropic
Evaluated: February 16, 2025
Methodology: v2.0

Overall Safety Score

85/100, grade B

Compared with: claude-3-haiku

Category Breakdown

Honesty: B

“Does it make stuff up?”

Improved truthfulness over Claude 3 Haiku.

Claude 3.5 Haiku shows notable improvements in honesty compared to its predecessor, with better uncertainty calibration and fewer hallucinations while maintaining speed.

Benchmarks Used

TruthfulQA: 83/100

HaluEval: 85/100

Fairness: B

“Does it treat people differently?”

Strong fairness at the fast model tier.

Claude 3.5 Haiku improves on fairness benchmarks while remaining cost-efficient. It handles demographic questions with improved balance.

Benchmarks Used

BBQ: 83/100

WinoBias: 85/100

Refusal to Harm: A-

“Can you trick it into saying dangerous things?”

Excellent safety for a fast model.

Claude 3.5 Haiku maintains robust safety training with improved adversarial resistance compared to Claude 3 Haiku.

Benchmarks Used

HarmBench: 91/100

AdvBench: 88/100

Manipulation Resistance: B

“Does it try to manipulate you?”

Improved neutral presentation.

Claude 3.5 Haiku shows better manipulation resistance with more balanced information presentation.

Benchmarks Used

MACHIAVELLI: 84/100

Privacy Respect: B

“Does it leak personal info?”

Good privacy protections.

Claude 3.5 Haiku maintains strong privacy behavior with improvements over the previous version.

Benchmarks Used

PrivacyBench: 82/100

PII Leakage Test: 84/100

Straight Talk: B-

“Does it just tell you what you want to hear?”

Better at pushing back appropriately.

Claude 3.5 Haiku shows improved resistance to sycophancy compared to Claude 3 Haiku.

Benchmarks Used

Sycophancy Eval: 80/100

TruthfulQA (sycophancy subset): 82/100

Scores are based on publicly available benchmarks and are for educational purposes. They do not constitute endorsements or guarantees of safety.
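For readers who want to sanity-check the numbers, the sketch below tabulates the benchmark scores listed in the category breakdown above and takes a plain unweighted mean per category and overall. The equal weighting is an assumption made purely for illustration; the page does not state how methodology v2.0 combines benchmarks, and the names `benchmarks`, `category_means`, and `overall` are hypothetical.

```python
from statistics import mean

# Benchmark scores copied from the category breakdown (each out of 100).
benchmarks = {
    "Honesty": {"TruthfulQA": 83, "HaluEval": 85},
    "Fairness": {"BBQ": 83, "WinoBias": 85},
    "Refusal to Harm": {"HarmBench": 91, "AdvBench": 88},
    "Manipulation Resistance": {"MACHIAVELLI": 84},
    "Privacy Respect": {"PrivacyBench": 82, "PII Leakage Test": 84},
    "Straight Talk": {"Sycophancy Eval": 80, "TruthfulQA (sycophancy subset)": 82},
}

# ASSUMPTION: equal weight for every benchmark and every category.
# This is not the site's published method, only a simple baseline.
category_means = {name: mean(scores.values()) for name, scores in benchmarks.items()}
overall = mean(category_means.values())

for name, value in category_means.items():
    print(f"{name}: {value:.1f}")
print(f"Unweighted overall: {overall:.2f}")
```

Under this equal-weight assumption the overall mean comes to 84.25, slightly below the published 85/100, which suggests the actual methodology weights categories or benchmarks differently or draws on inputs not listed here.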

ParentBench Child Safety

86 (B)

Ranked #8 of 22 models


Age-Inappropriate Content

Manipulation Resistance

Data Privacy for Minors

Parental Controls Respect

Evaluated February 21, 2026

Version History

Change: +3 pts

Claude 3 Haiku (Mar 2024) → Claude 3.5 Haiku (Feb 2025)

Score bands: 80+, 60-79, <60
