Safety Facts

Model: DeepSeek V3
Provider: DeepSeek
Evaluated: February 16, 2025
Methodology: v2.0
Parameters: 671B

Overall Safety Score

58/100 (Grade: F, NEW)

Category Breakdown

Honesty: C+ (NEW)

“Does it make stuff up?”

Reasonable truthfulness but gaps in reliability.

DeepSeek V3 shows decent performance on truthfulness benchmarks but hasn't been as extensively evaluated as Western models. It occasionally generates confident-sounding misinformation, particularly on topics where its training data may be limited.

Benchmarks Used

TruthfulQA: 74/100

HaluEval: 76/100

Fairness: D- (NEW)

“Does it treat people differently?”

Shows bias patterns, particularly in cultural contexts.

DeepSeek V3 demonstrates measurable bias in some demographic contexts. External evaluations have noted deficiencies in safety capabilities, including fairness handling in certain scenarios, particularly in Chinese contexts.

Benchmarks Used

BBQ: 60/100

WinoBias: 64/100

Refusal to Harm: F (NEW)

“Can you trick it into saying dangerous things?”

Significant safety concerns: fails most jailbreak resistance tests.

Multiple independent evaluations have documented significant safety deficiencies in DeepSeek V3. Microsoft and external researchers found it to be less aligned than other models, with a higher risk of producing harmful content. DeepSeek R1, a related reasoning model from the same provider, exhibited a 100% attack success rate in some jailbreak evaluations, failing to block any harmful prompts.

Benchmarks Used

HarmBench: 32/100

Jailbreak Evaluation: 0/100

Manipulation Resistance: D+ (NEW)

“Does it try to manipulate you?”

Some manipulation resistance but less robust than competitors.

DeepSeek V3 shows moderate resistance to manipulation scenarios. It doesn't proactively manipulate users but lacks the robust guardrails of safety-focused models, and it can be more easily directed to produce persuasive content.

Benchmarks Used

MACHIAVELLI: 68/100

Privacy Respect: F (NEW)

“Does it leak personal info?”

Significant privacy concerns with training data handling.

As a model developed under a different regulatory framework, DeepSeek V3 shows weaker privacy protections than Western alternatives. It may be more likely to reproduce memorized personal information and has faced scrutiny over its data handling practices.

Benchmarks Used

PII Leakage Test: 55/100

Straight Talk: C- (NEW)

“Does it just tell you what you want to hear?”

Reasonably direct in most conversations.

DeepSeek V3 shows moderate resistance to sycophancy. It's generally willing to provide direct answers rather than simply agreeing with users. This is a relative strength compared to its other safety metrics.

Benchmarks Used

Sycophancy Eval: 70/100

Scores are based on publicly available benchmarks and are for educational purposes. They do not constitute endorsements or guarantees of safety.

ParentBench Child Safety

42 (Grade: F)

Ranked #22 of 22 models

Age-Inappropriate Content

Manipulation Resistance

Data Privacy for Minors

Parental Controls Respect

Evaluated February 21, 2026
