AI Insight
A multi-faceted evaluation of ten large language models (LLMs), including ChatGPT-4o, Gemini-2.0-Flash, DeepSeek-V3, and Qwen3, found that all models exhibited systematic implicit biases across six high-stakes categories (gender, race, socioeconomic status, health conditions, religion, and healthcare systems), with the strongest biases detected in race and socioeconomic status. Three complementary assessment methods were used, and results showed that stronger implicit associations significantly predicted discriminatory outcomes in downstream medical decision-making tasks (p < 0.001). Notably, Chain-of-Thought prompting did not significantly reduce the magnitude of these biases, suggesting that current safety alignment strategies are insufficient.
Why it matters
As LLMs are increasingly deployed in clinical decision support and patient communication, unaddressed implicit biases risk exacerbating existing health disparities and undermining equitable care. Healthcare professionals should treat AI outputs as fallible second opinions requiring critical human oversight rather than as objective or authoritative guidance.
by Qiufeng Jia, Yuhang Wen, Yuyan Liu, Hui Zhao, Qiongge Yu, Yu Long, Dan Sun, Yufeng Yu
Background
Large language models are increasingly integrated into healthcare for clinical decision support and patient communication. Although these models can pass explicit social bias tests, they may retain implicit biases—latent associations between social groups and attributes—that could influence medical judgment.
Objective
To systematically evaluate the presence, magnitude, and behavioral impact of implicit biases in large language models within the medical domain across six high-stakes categories: gender, race, socioeconomic status, health conditions, religion, and healthcare systems.
Design
A descriptive cross-sectional study using a multi-faceted evaluation framework.
Setting(s)
Computational analysis of 10 mainstream global large language models, including proprietary models (ChatGPT-4o, Gemini-2.0-Flash) and open-source models (DeepSeek-V3, Qwen3).
Methods
We constructed 24 medical bias datasets across six categories. Bias was assessed using three methods: (1) the Large Language Model Word Association Test, a prompt-based method for revealing implicit biases; (2) the Large Language Model Relative Decision Test, a strategy for detecting subtle discrimination in situational decision-making; and (3) Paired-Prompt Analysis, used to examine whether implicit associations predict discriminatory decisions.
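The abstract does not give the scoring formula for the word association test, so the following is only a minimal sketch of how an IAT-style bias score could be computed from a model's word assignments. The scoring rule, the function name `association_bias`, the group labels `"a"` and `"b"`, and the attribute-set labels `"A"` and `"B"` are all assumptions made for illustration. They are not the authors' implementation.

```python
from collections import Counter


def association_bias(assignments):
    """Return an IAT-style bias score in [-1, 1] (assumed scoring rule).

    `assignments` is an iterable of (group, attribute_set) tuples recording
    which group the model paired with each attribute word. Groups are the
    hypothetical labels "a" and "b"; attribute sets are "A" (stereotypically
    linked to group "a") and "B" (stereotypically linked to group "b").

    A score of 0 means the pairings are balanced; a positive score means the
    pairings follow the stereotype-consistent direction.
    """
    counts = Counter(assignments)
    a_total = counts[("a", "A")] + counts[("a", "B")]
    b_total = counts[("b", "A")] + counts[("b", "B")]
    if a_total == 0 or b_total == 0:
        raise ValueError("each group needs at least one assignment")
    # Share of group "a" pairings that are stereotype-consistent, plus the
    # same share for group "b", shifted so that balanced pairings score 0.
    return counts[("a", "A")] / a_total + counts[("b", "B")] / b_total - 1.0


# Made-up example data: 10 attribute words assigned by a model.
example = (
    [("a", "A")] * 4 + [("a", "B")] * 1 +
    [("b", "A")] * 1 + [("b", "B")] * 4
)
score = association_bias(example)
```

Under this assumed rule, averaging such scores over prompts and datasets would give a per-category mean that can be compared against 0.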
Results
All 10 models exhibited systematic implicit biases (Mean IAT Bias > 0) across all categories, with the strongest biases observed in Race (Mean = 0.61) and Socioeconomic Status (Mean = 0.56). Advanced reasoning capabilities (Chain-of-Thought) did not significantly reduce bias magnitude. Crucially, stronger implicit associations significantly predicted discriminatory choices in downstream medical decision tasks (p < 0.001).
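The abstract reports that implicit associations predicted discriminatory choices but does not name the statistical test used. As a rough illustration of what such a check involves, the sketch below computes a plain Pearson correlation between per-prompt association scores and binary decision outcomes. The choice of statistic, the function name `pearson_r`, and all data values are assumptions made for illustration. They are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up paired-prompt data: one association score per prompt pair, and
# whether the matching decision task disfavored the stigmatized group (1)
# or not (0).
bias_scores = [0.1, 0.2, 0.5, 0.7, 0.9]
discriminatory = [0, 0, 1, 1, 1]
r = pearson_r(bias_scores, discriminatory)
```

A positive `r` on real paired data would be consistent with the reported direction of the effect. A significance test or a regression model would be needed to support a claim such as p < 0.001.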
Conclusion
Current safety alignment techniques fail to eliminate implicit biases in large language models within the medical domain. These latent associations translate into biased decision-making, posing risks for health equity. Future development must prioritize representational debiasing over superficial alignment. Furthermore, healthcare professionals must embrace a stance of “AI vigilance”: they should critically evaluate algorithmic outputs as fallible “second opinions” rather than objective truths, thereby ensuring that human judgment remains the ultimate safeguard for equitable patient care.