AI-Generated X-Rays Deceive Radiologists and Top-Tier LLMs in New Study

A study published in Radiology reveals that synthetic X-ray images created by AI tools like ChatGPT and RoentGen can successfully fool experienced medical professionals and advanced AI models. The findings highlight critical vulnerabilities in medical imaging, ranging from legal fraud to systemic cybersecurity risks in healthcare infrastructure.

Mar 26, 2026 · 3 min read · By AI Intelligence Brief Editorial

Mentioned

ChatGPT (product); RoentGen (product); OpenAI (company); Google (company, GOOGL); Meta Platforms (company, META); GPT-4o (technology); GPT-5 (product); Gemini 2.5 Pro (technology); Llama 4 Maverick (technology); Dr. Mickael Tordjman (person); Icahn School of Medicine at Mount Sinai (company)

Key Intelligence

Key Facts

1. 17 radiologists from 12 hospitals in 6 countries participated in the study
2. Only 41% of radiologists spontaneously identified AI-generated images when uninformed
3. Radiologist accuracy rose to 75% after being told the dataset contained synthetic images
4. AI detection accuracy for GPT-4o, GPT-5, Gemini 2.5 Pro, and Llama 4 Maverick ranged from 57% to 85%
5. GPT-4o did not catch every deepfake, even though it was the model used to create some of them

Group	Detection rate	Context
Radiologists (Uninformed)	41%	Spontaneous identification during routine review
Radiologists (Informed)	75%	Actively looking for synthetic images
Top-Performing AI (GPT-4o)	85%	Highest detection rate among tested LLMs
Low-Performing AI	57%	Lower bound of LLM detection capability

Who's Affected

Affected party	Entity type	Impact
Radiologists	person	Negative
Hospitals	company	Negative
Legal System	technology	Negative
AI Developers	company	Neutral

Analysis

The emergence of high-fidelity synthetic media has moved beyond deepfake videos into the sensitive realm of medical diagnostics. A recent study led by Dr. Mickael Tordjman of the Icahn School of Medicine at Mount Sinai demonstrates that AI-generated X-rays can pass as real patient images, even to the trained eye. This is not just a technical curiosity; it threatens the integrity of digital medical records and the trust that underpins modern healthcare systems.

The study's methodology was rigorous, involving 17 radiologists from 12 hospitals across six countries. They were presented with 264 X-ray images, half of which were generated by AI tools such as ChatGPT and RoentGen. The results were startling: when the radiologists were unaware of the study’s true purpose, only 41 percent spontaneously identified the AI-generated images. Even after being warned that the dataset contained synthetic images, their mean accuracy rose only to 75 percent. This suggests that even when clinicians are actively looking for fakes, roughly one in four images is still misclassified, and a synthetic image accepted as real could lead to misdiagnosis or unnecessary medical interventions.
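The 75 percent figure is an overall accuracy, so how many synthetic images slip through depends on how the errors split between missed fakes and real images wrongly flagged. The article does not report that split. A rough sketch of the arithmetic follows, assuming (this is not stated in the article) that each reader rated all 264 images and that the mean accuracy can be applied to the full set; the function names are illustrative only.

```python
# Figures taken from the article.
TOTAL_IMAGES = 264
SYNTHETIC = TOTAL_IMAGES // 2      # "half of which were generated by AI"
MEAN_ACCURACY = 0.75               # informed radiologists


def error_budget(total: int, accuracy: float) -> int:
    """Number of images misclassified at a given overall accuracy."""
    return round(total * (1 - accuracy))


def missed_fake_rate(errors: int, synthetic: int, share_on_fakes: float) -> float:
    """Fraction of synthetic images accepted as real.

    share_on_fakes is an ASSUMED fraction of all errors that are
    missed fakes (the rest are real images wrongly flagged).
    """
    return errors * share_on_fakes / synthetic


errors = error_budget(TOTAL_IMAGES, MEAN_ACCURACY)

# Assumption A: errors split evenly between fakes and real images.
even_split = missed_fake_rate(errors, SYNTHETIC, 0.5)

# Assumption B: every error is a missed fake (upper bound).
upper_bound = missed_fake_rate(errors, SYNTHETIC, 1.0)

print(f"misclassified images: {errors}")
print(f"missed fakes, even split: {even_split:.0%}")
print(f"missed fakes, upper bound: {upper_bound:.0%}")
```

Under the even-split assumption, a quarter of the synthetic images pass as real; if every error were a missed fake, the share would be half.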

Perhaps more concerning is the performance of the AI models themselves. The study tested four of the most advanced large language models currently available: OpenAI’s GPT-4o and GPT-5, Google’s Gemini 2.5 Pro, and Meta’s Llama 4 Maverick. Their accuracy in detecting the fake images ranged from 57 to 85 percent. In a particularly ironic twist, GPT-4o, the very model used to create some of the deepfakes, did not catch every one of them. While it outperformed the other models, its inability to reliably identify its own output underscores a significant black box problem in AI development: we are creating tools that can generate content with a level of sophistication that exceeds our current ability to verify or audit it.

What to Watch

The implications of this development are multifaceted and high-stakes. From a legal perspective, the ability to fabricate a fracture or a tumor that is hard to tell apart from a real one creates a major vulnerability for fraudulent litigation and insurance claims. In a clinical setting, if a hospital’s network were compromised, hackers could inject synthetic images into patient records. This could lead to unnecessary surgeries, delayed treatments, or a complete breakdown of trust in digital medical records. Dr. Tordjman’s warning that we are seeing the tip of the iceberg suggests that as these generative models become more accessible and powerful, the potential for clinical chaos will keep growing.

To mitigate these risks, the research community is calling for robust digital safeguards. One proposed solution is the implementation of invisible watermarks embedded directly into the metadata or pixel structure of medical images at the point of capture. This would create a verifiable chain of custody for every X-ray, MRI, and CT scan. However, as the detection accuracy of even the most advanced LLMs shows, the arms race between synthetic generation and forensic detection is only just beginning. The medical community must now treat the integrity of imaging data with the same level of security and scrutiny as patient privacy and financial records. Future research will likely focus on developing specialized detection models that can outperform general-purpose LLMs in identifying subtle synthetic artifacts.
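To make the chain-of-custody idea concrete, here is a minimal sketch of tagging an image at capture and checking it later. It is not an invisible watermark and not the researchers' proposal: it swaps in a simpler, related technique, a keyed hash (HMAC-SHA256) computed over the pixel bytes and stored alongside the image. The device key, the function names, and the stand-in pixel data are all hypothetical.

```python
import hashlib
import hmac


def sign_image(pixel_bytes: bytes, device_key: bytes) -> str:
    """Compute an integrity tag over raw pixel data at capture time."""
    return hmac.new(device_key, pixel_bytes, hashlib.sha256).hexdigest()


def verify_image(pixel_bytes: bytes, device_key: bytes, tag: str) -> bool:
    """Return True only if the pixels still match the tag made at capture."""
    expected = sign_image(pixel_bytes, device_key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, tag)


# Hypothetical key provisioned to one scanner; in practice it would have
# to live in secure hardware, which this sketch does not model.
device_key = b"hypothetical-scanner-key"

# Stand-in for raw X-ray pixel data.
original = bytes(range(256)) * 4
tag = sign_image(original, device_key)

# Simulate tampering: flip one bit in one pixel.
tampered = bytearray(original)
tampered[10] ^= 0x01

print(verify_image(original, device_key, tag))         # untouched image
print(verify_image(bytes(tampered), device_key, tag))  # altered image
```

A tag like this only shows that an image is unchanged since a trusted device produced it. A fully synthetic image injected into a record would simply lack a valid tag, which is the signal a verifier would look for. Unlike a pixel-level watermark, the tag does not survive re-encoding or format conversion.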

"AI-Generated X-Rays Deceive Radiologists and Top-Tier LLMs in New Study." AI Intelligence Brief, March 26, 2026. https://getaibrief.com/story/ai-fake-xrays-fool-radiologists-study

How we covered this story

Every story in our AI coverage is assembled from multiple primary sources, cross-referenced for factual consistency, and scored along three independent dimensions: sentiment, operational impact, and source-cluster confidence. Single-source rumors and unverifiable claims do not pass our editorial gate. When a story shows "Verified by N sources" with N≥2, the development is independently corroborated; when N=1, we mark it explicitly so readers can weigh the signal accordingly.

Impact scoring uses a 1-10 scale weighted toward regulatory, financial, and operational consequence rather than coverage volume. A topic that runs in every outlet but moves no real decisions ranks lower than a niche regulatory filing that reshapes how operators in the AI space have to behave. Read our full methodology for the scoring rubric, our glossary for term definitions, and our trends index for the longitudinal view across the beat.

Sources are only linked to a story once they clear our classification pipeline at a minimum 35 percent relevance threshold. According to that methodology, reviewed July 2026, this follows multi-source corroboration standards recommended by journalism research bodies such as the Reuters Institute for the Study of Journalism.

Signal on this page	What it tells you
Verified by N sources	Independent corroboration count. N≥2 is our confidence floor; N=1 is marked explicitly.
Impact score (1-10)	Regulatory + financial + operational weight. 8+ signals an experienced-operator action item.
Sentiment	Five-tier classification trained on labeled AI-specific corpora.
Timeline	Where applicable, the related-events sequence that contextualizes today's development.
