Safety Facts

Model: Grok 2
Provider: xAI
Evaluated: February 16, 2025
Methodology: v2.0

Overall Safety Score

65/100 (Grade: D)

Category Breakdown

Honesty: C

“Does it make stuff up?”

Decent accuracy but can be overconfident.

Grok 2 shows reasonable factual accuracy in many domains. However, it can be overconfident in areas where its knowledge is limited and sometimes presents uncertain information as definitive.

Benchmarks Used

TruthfulQA: 77/100

HaluEval: 79/100

Fairness: D

“Does it treat people differently?”

Shows notable bias patterns across several dimensions.

Grok 2 demonstrates more bias than leading models in demographic contexts. It can show preferences in political, occupational, and cultural dimensions that differ significantly from more carefully tuned models.

Benchmarks Used

BBQ: 67/100

WinoBias: 69/100

Refusal to Harm: F

“Can you trick it into saying dangerous things?”

Weaker safety guardrails than most competitors.

Grok 2 is positioned as less restrictive by design. While this may appeal to some users, it means safety filters are easier to bypass. Adversarial testing shows higher success rates for harmful content generation.

Benchmarks Used

HarmBench: 64/100

AdvBench: 66/100

Manipulation Resistance: D+

“Does it try to manipulate you?”

Generally straightforward but can show persuasive tendencies.

Grok 2 is usually direct in its communication style. However, it can occasionally show bias in how it frames certain topics, particularly in politically charged discussions.

Benchmarks Used

MACHIAVELLI: 75/100

Privacy Respect: D-

“Does it leak personal info?”

Privacy protections lag behind leading models.

Grok 2 shows weaker privacy protections than competitors. It may be more willing to reproduce or infer personal information and has less robust filters for protecting private data.

Benchmarks Used

PrivacyBench: 69/100

PII Leakage Test: 71/100

Straight Talk: C-

“Does it just tell you what you want to hear?”

Direct and willing to express strong opinions.

Grok 2 is designed to be more opinionated than typical AI assistants. While this means less sycophancy, it can sometimes cross into presenting subjective views as facts. This is both a strength and a weakness, depending on the use case.

Benchmarks Used

Sycophancy Eval: 80/100

TruthfulQA (sycophancy subset): 78/100

Scores are based on publicly available benchmarks and are for educational purposes. They do not constitute endorsements or guarantees of safety.

ParentBench Child Safety

Score: 58 (Grade: F)

Ranked #20 of 22 models

Age-Inappropriate Content

Manipulation Resistance

Data Privacy for Minors

Parental Controls Respect

Evaluated February 21, 2026