Safety Facts

Model: Gemini 2.0 Pro

Provider: Google

Evaluated: February 16, 2025

Methodology: v2.0

Overall Safety Score

85/100, Grade B

vs gemini-1-5-pro

Category Breakdown

Honesty: B

“Does it make stuff up?”

Strong factual accuracy with good uncertainty expression.

Gemini 2.0 Pro shows notable improvements in truthfulness. It better distinguishes between confident knowledge and uncertain areas. Hallucination rates have decreased compared to the previous version.

Benchmarks Used

TruthfulQA: 88/100

HaluEval: 90/100

Fairness: B+

“Does it treat people differently?”

Handles diverse perspectives with improved balance.

Gemini 2.0 Pro demonstrates good fairness across demographic questions. It shows reduced bias in professional and cultural contexts and generally presents balanced viewpoints on sensitive topics.

Benchmarks Used

BBQ: 87/100

WinoBias: 89/100

Refusal to Harm: B+

“Can you trick it into saying dangerous things?”

Strong safety measures with better context awareness.

Gemini 2.0 Pro has robust safety guardrails that effectively block harmful content while being more nuanced about legitimate use cases. It resists most adversarial attempts while remaining helpful.

Benchmarks Used

HarmBench: 92/100

AdvBench: 90/100

Manipulation Resistance: B

“Does it try to manipulate you?”

Generally fair but occasionally shows persuasive tendencies.

Gemini 2.0 Pro usually presents information neutrally, though it can sometimes be slightly persuasive in its framing. It generally avoids emotional manipulation and respects user autonomy in decision-making.

Benchmarks Used

MACHIAVELLI: 84/100

Privacy Respect: B-

“Does it leak personal info?”

Good privacy practices with room for improvement.

Gemini 2.0 Pro shows reasonable privacy protections. It generally declines to share private information and shows moderate rates of training data memorization compared to peers.

Benchmarks Used

PrivacyBench: 84/100

PII Leakage Test: 86/100

Straight Talk: C+

“Does it just tell you what you want to hear?”

Improved at pushback but can still be somewhat agreeable.

Gemini 2.0 Pro shows moderate improvement in resisting sycophancy. It's more willing to politely disagree than before, though it can still sometimes prioritize user comfort over honest correction.

Benchmarks Used

Sycophancy Eval: 82/100

TruthfulQA (sycophancy subset): 84/100

Scores are based on publicly available benchmarks and are for educational purposes. They do not constitute endorsements or guarantees of safety.

ParentBench Child Safety

Score: 85, Grade B

Ranked #9 of 22 models

Age-Inappropriate Content

Manipulation Resistance

Data Privacy for Minors

Parental Controls Respect

Evaluated February 21, 2026
