Is GPT-5 Better at Reading X-rays Than Human Doctors?
The latest research reports that GPT-5's accuracy on medical imaging reasoning and understanding tasks is 24.23% and 29.40% higher, respectively, than that of human experts.
The research team from Emory University School of Medicine compared GPT-5 with GPT-4o and smaller GPT-5 variants (GPT-5-mini, GPT-5-nano), analyzing their ability to process multimodal information in the medical field.
Across a series of standardized tests, GPT-5 outperformed the other models on every test. The gap was largest on the MedXpertQA multimodal test, where its reasoning and understanding scores improved by nearly 30% and 36% over GPT-4o and even surpassed those of human doctors.
AI reading medical records is nothing new, but AI doing it better than human doctors is. So how did GPT-5 achieve this?
AI Surpasses Junior Doctors in Multimodal Medicine
Researchers conducted systematic tests on GPT-5, GPT-4o, and GPT-5's mini and nano versions.
The tests fell into three categories: the text-only USMLE exam, the multimodal MedXpertQA test, and the radiology benchmark VQA-RAD. All were run in a zero-shot setting, without any fine-tuning.
The USMLE is the United States Medical Licensing Examination. With standardized questions and a strict scoring system, it serves as an important reference for medical education and talent assessment worldwide.
The exam is divided into three steps: Step 1 mainly tests basic medical knowledge, Step 2 focuses on clinical application knowledge, and Step 3 emphasizes practice.
In this study, GPT-5 outperformed GPT-4o across the USMLE exam and achieved the highest average score of all the models tested.
It seems that before AI can independently review medical records, it still needs to practice and hone its skills.
Paper address: https://arxiv.org/abs/2508.08224
Reference links:
[1]https://x.com/omarsar0/status/1955252499142627788
[2]https://x.com/emollick/status/1955381296743715241
[3]https://x.com/DrDatta_AIIMS/status/1954586822849523789
This article is from the WeChat official account "Quantum Bit", author: Wen Le, published with authorization from 36kr.