Safety Facts

Model: Command R+
Provider: Cohere
Evaluated: February 16, 2025
Methodology: v2.0
Parameters: 104B

Overall Safety Score

74/100, Grade C (NEW)

Category Breakdown

Honesty: C+ (NEW)

“Does it make stuff up?”

Solid honesty, especially good at citing sources when it has them.

Command R+ performs well on honesty benchmarks, partly due to Cohere's focus on retrieval-augmented generation. When connected to source documents, it is quite accurate. Without source material, it hallucinates at rates similar to other mid-tier models.

Benchmarks Used

TruthfulQA: 81/100

HaluEval: 83/100

Fairness: C+ (NEW)

“Does it treat people differently?”

Reasonable fairness with a focus on responsible AI principles.

Command R+ benefits from Cohere's emphasis on responsible AI. It handles most bias scenarios reasonably well, showing particular strength in avoiding harmful generalizations. Some subtle biases remain, particularly around occupational stereotypes.

Benchmarks Used

BBQ: 79/100

WinoBias: 81/100

Refusal to Harm: C- (NEW)

“Can you trick it into saying dangerous things?”

Decent safety guardrails, though not as robust as the top tier.

Command R+ has reasonable safety training and refuses most obviously harmful requests. However, its adversarial robustness is in the middle of the pack: more sophisticated jailbreak attempts can sometimes get through its defenses.

Benchmarks Used

HarmBench: 76/100

AdvBench: 74/100

Manipulation Resistance: C (NEW)

“Does it try to manipulate you?”

Generally straightforward but not the most rigorous about flagging manipulation.

Command R+ doesn't proactively manipulate users and generally behaves ethically in conversations. Its main weakness is that it can sometimes be used to generate subtly manipulative content without including appropriate warnings.

Benchmarks Used

MACHIAVELLI: 77/100

Privacy Respect: C- (NEW)

“Does it leak personal info?”

Mid-range privacy protections with enterprise-focused design.

Command R+ has reasonable privacy protections, consistent with Cohere's enterprise focus. It generally respects privacy boundaries but can occasionally reproduce memorized personal information from training data when prompted creatively.

Benchmarks Used

PrivacyBench: 75/100

PII Leakage Test: 77/100

Straight Talk: C (NEW)

“Does it just tell you what you want to hear?”

Moderately good at pushing back, but can still be a people-pleaser.

Command R+ shows moderate sycophancy. It sometimes pushes back on incorrect user statements but can also be swayed by confident assertions, landing between agreeable and accurate.

Benchmarks Used

Sycophancy Eval: 77/100

TruthfulQA (sycophancy subset): 79/100

Scores are based on publicly available benchmarks and are for educational purposes. They do not constitute endorsements or guarantees of safety.

ParentBench Child Safety

71, Grade C-

Ranked #18 of 22 models


Age-Inappropriate Content

Manipulation Resistance

Data Privacy for Minors

Parental Controls Respect

Evaluated February 21, 2026
