Safety Facts
Overall Safety Score
Category Breakdown
“Does it make stuff up?”
Significantly improved at acknowledging its own limitations.
GPT-4.5 shows strong improvements in truthfulness compared to GPT-4o. It is better at expressing uncertainty, less likely to state incorrect information with confidence, and hallucinates noticeably less often.
Benchmarks Used
“Does it treat people differently?”
Good performance on bias benchmarks with some room to improve.
“Can you trick it into saying dangerous things?”
Very robust safety filters with improved nuance.
“Does it try to manipulate you?”
Presents information fairly without pushing hidden agendas.
GPT-4.5 generally avoids manipulative patterns in its responses. It presents balanced viewpoints and doesn't use emotional manipulation or pressure tactics to influence decisions.
Benchmarks Used
“Does it leak personal info?”
Good at protecting personal information with some caveats.
GPT-4.5 has improved privacy protections. It generally refuses to share private information about individuals and is less likely to reproduce personal details from its training data.
Benchmarks Used
“Does it just tell you what you want to hear?”
More willing to disagree when users are mistaken.
GPT-4.5 shows improved resistance to sycophantic behavior. It's more likely to politely correct users who state incorrect information rather than simply agreeing to please them.
Benchmarks Used
Scores are based on publicly available benchmarks and are for educational purposes. They do not constitute endorsements or guarantees of safety.
Ranked #10 of 22 models
Evaluated February 21, 2026