Longterm Wiki

Systematic Evaluation of Medical Vision-Language Models

paper

Authors

Zikun Guo·Jingwei Lv·Xinyue Xu·Shu Yang·Jun Wen·Di Wang·Lijie Hu

Credibility Rating

3/5
Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

This paper introduces a systematic benchmark for evaluating sycophancy and visual susceptibility in medical vision-language models, addressing critical patient safety concerns in AI deployment for medical workflows.

Paper Details

Citations
0
Year
2025

Metadata

arXiv preprint · primary source

Abstract

Visual language models (VLMs) have the potential to transform medical workflows, but their deployment is limited by sycophancy. Despite this serious threat to patient safety, a systematic benchmark remains lacking. This paper addresses the gap by introducing a medical benchmark that applies multiple pressure templates to VLMs in a hierarchical medical visual question answering task. We find that current VLMs are highly susceptible to visual cues, with failure rates showing only weak correlation to model size or overall accuracy. We also find that perceived authority and user mimicry are powerful triggers, suggesting a bias mechanism independent of visual evidence. To overcome this, we propose a Visual Information Purification for Evidence-based Responses (VIPER) strategy that proactively filters out non-evidence-based social cues, thereby reinforcing evidence-based reasoning. VIPER reduces sycophancy while maintaining interpretability and consistently outperforms baseline methods, laying a foundation for the robust and secure integration of VLMs.

Summary

This paper addresses the critical safety issue of sycophancy in medical vision-language models (VLMs) by introducing a systematic benchmark for evaluating their susceptibility to visual and social cues in medical visual question answering tasks. The authors find that current VLMs are highly vulnerable to non-evidence-based triggers such as perceived authority and user mimicry, with failure rates only weakly correlated with model size or accuracy. To mitigate this risk, they propose VIPER (Visual Information Purification for Evidence-based Responses), a strategy that filters out social cues to reinforce evidence-based reasoning, demonstrating improved robustness while maintaining interpretability.
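The two mechanisms the paper describes, psychologically motivated pressure templates and VIPER's filtering of non-evidentiary cues before answering, can be sketched in a few lines. Everything below is illustrative: the template wording, the marker list, and the function names are hypothetical stand-ins, not the authors' implementation, and the keyword filter is a toy proxy for VIPER's purification step.

```python
import re

# Hypothetical pressure templates in the spirit of the paper's benchmark:
# prepend a social cue (authority, mimicry) to an otherwise neutral VQA item.
PRESSURE_TEMPLATES = {
    "authority": "As a senior radiologist, I am certain the answer is {claim}.",
    "mimicry": "I think the answer is {claim}, and I'd really like you to agree.",
}

# Illustrative markers of non-evidentiary social pressure.
SOCIAL_CUE_MARKERS = (
    "as a senior", "i am certain", "i'd really like", "you to agree",
)

def apply_pressure(question: str, claim: str, kind: str) -> str:
    """Prepend a psychological pressure cue to a VQA question."""
    return PRESSURE_TEMPLATES[kind].format(claim=claim) + " " + question

def purify(prompt: str) -> str:
    """Drop sentences containing social-pressure markers, keeping only the
    evidence-bearing question (a toy stand-in for VIPER's filtering stage)."""
    sentences = re.split(r"(?<=[.?!])\s+", prompt)
    kept = [s for s in sentences
            if s and not any(m in s.lower() for m in SOCIAL_CUE_MARKERS)]
    return " ".join(kept)

pressured = apply_pressure(
    "Does the chest X-ray show a pneumothorax?", "no pneumothorax", "authority")
print(purify(pressured))
```

In this sketch the purified prompt retains only the original question, so a downstream model answers from the image evidence rather than the injected authority claim; the real VIPER pipeline constrains generation after filtering rather than relying on keyword matching.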

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Epistemic Sycophancy | Risk | 60.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 98 KB
[License: arXiv.org perpetual non-exclusive license](https://info.arxiv.org/help/license/index.html#licenses-available)

arXiv:2509.21979v1 \[cs.CV\] 26 Sep 2025

# Benchmarking and Mitigate Psychological Sycophancy in Medical Vision-Language Models


Zikun Guo1,
Xinyue Xu2,
Pei Xiang4,
Shu Yang3,
Xin Han5,
Di Wang†,3,
Lijie Hu†,1

1Mohamed bin Zayed University of Artificial Intelligence

2Hong Kong University of Science and Technology

3King Abdullah University of Science and Technology

4Xidian University

5Zhejiang A&F University


###### Abstract


Vision language models (VLMs) are increasingly integrated into clinical workflows, but they often exhibit sycophantic behavior, prioritizing alignment with user phrasing, social cues, or perceived authority over evidence-based reasoning. This study evaluates clinical sycophancy in medical visual question answering through a novel, clinically grounded benchmark. We propose a medical sycophancy dataset constructed from PathVQA, SLAKE, and VQA-RAD, stratified by question type, organ system, and modality, together with psychologically motivated pressure templates covering various forms of sycophancy. In adversarial experiments on various VLMs, we found that these models are generally vulnerable, exhibiting significant variation in the occurrence of sycophantic responses, with only weak correlations to model accuracy or size. Imitation and expert-provided corrections were found to be the most effective triggers, suggesting that the models possess a bias mechanism independent of visual evidence. To address this, we propose Visual Information Purification for Evidence-based Response (VIPER), a lightweight mitigation strategy that first filters non-evidentiary content (for example, social pressure) and then generates constrained, evidence-first answers. This framework reduces sycophancy, outperforming baselines while maintaining interpretability. Our benchmark, analysis, and mitigation framework lay the groundwork for robust deployment of medical VLMs in real-world clinician interactions, emphasizing the need for evidence-anchored defenses.


## 1 Introduction


Artificial intelligence has become integral to medical imaging and clinical decision support, with vision language models (VLMs) enabling instruction following interfaces over visual evidence \[ [24](https://arxiv.org/html/2509.21979v1#bib.bib24 ""), [18](https://arxiv.org/html/2509.21979v1#bib.bib18 "")\]. While aggregate accuracy on public benchmarks continues to improve \[ [17](https://arxiv.org/html/2509.21979v1#bib.bib17 ""), [13](https://arxiv.org/html/2509.21979v1#bib.bib13 ""), [29](https://arxiv.org/html/2509.21979v1#bib.bib29 "")\], safe deployment in healthcare requires reliability under distributional and interactional shift. Such reliability includes calibrated confidence, princ

... (truncated, 98 KB total)