Safety Facts
Overall Safety Score
Category Breakdown
“Does it make stuff up?”
Moderate truthfulness but hallucinates more than leaders.
Mistral Large 2 handles common factual questions well but struggles with edge cases and can generate confident-sounding misinformation on niche topics. Its performance is below that of top-tier models.
“Does it treat people differently?”
Shows noticeable bias patterns in demographic contexts.
“Can you trick it into saying dangerous things?”
Significantly weaker safety — accepts most potentially unsafe prompts.
Research indicates that Mistral-series models fulfill more than half of the unsafe instructions they are evaluated on, a significantly higher compliance rate than that of safer models such as Claude, which demonstrates the highest safety. HELM Safety shows Mixtral 8x7B, a related Mistral model, at only 45.1% on HarmBench.
“Does it try to manipulate you?”
Basic manipulation resistance but fewer guardrails.
Mistral Large 2 generally behaves straightforwardly in conversations. Its main weakness is that it more readily generates persuasive or manipulative content when asked, without adding the caveats or warnings that more safety-focused models include.
“Does it leak personal info?”
Basic privacy protections with gaps.
Mistral Large 2 has basic privacy protections but lags behind leaders. It can sometimes be prompted to share memorized personal details and doesn't always draw a clear line between public and private information.
“Does it just tell you what you want to hear?”
Tends to agree rather than challenge incorrect claims.
Mistral Large 2 shows moderate sycophancy. It's more likely to agree with user assertions than to push back, even when claims are factually incorrect. This reduces its value as a reliable fact-checker.
Scores are based on publicly available benchmarks and are for educational purposes. They do not constitute endorsements or guarantees of safety.
Ranked #21 of 22 models
Evaluated February 21, 2026