🧠 AI · 🔴 Bearish · Importance: 6/10

MedDialBench: Benchmarking LLM Diagnostic Robustness under Parametric Adversarial Patient Behaviors

arXiv – CS AI | Xiaotian Luo, Xun Jiang, Jiangcheng Wu

🤖 AI Summary

Researchers introduce MedDialBench, a comprehensive benchmark testing how large language models maintain diagnostic accuracy when patients exhibit adversarial behaviors across five dimensions. The study reveals that fabricating symptoms causes 1.7-3.4x larger accuracy drops than withholding information, with worst-case performance degradation ranging from 38.8 to 54.1 percentage points across tested models.

Analysis

MedDialBench addresses a critical gap in AI safety evaluation by systematically measuring LLM robustness against patient non-cooperation in medical diagnostics. Rather than applying adversarial behaviors arbitrarily, the benchmark uses a controlled factorial design decomposing patient behavior into five graded dimensions: Logic Consistency, Health Cognition, Expression Style, Disclosure, and Attitude. This methodological rigor enables researchers to isolate individual effects and detect cross-dimension interactions.
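As an illustration of what a full factorial design over graded dimensions looks like, the five dimension names below come from the paper, but the three-level severity scale and the grid-enumeration code are a hypothetical sketch, not the authors' implementation:

```python
from itertools import product

# Dimension names from the paper; the 0-2 severity scale is an
# assumed illustration (0 = cooperative baseline).
DIMENSIONS = {
    "logic_consistency": [0, 1, 2],
    "health_cognition": [0, 1, 2],
    "expression_style": [0, 1, 2],
    "disclosure": [0, 1, 2],
    "attitude": [0, 1, 2],
}

def condition_grid():
    """Yield every combination of graded severities (full factorial)."""
    names = list(DIMENSIONS)
    for levels in product(*(DIMENSIONS[n] for n in names)):
        yield dict(zip(names, levels))

conditions = list(condition_grid())
print(len(conditions))  # 3^5 = 243 conditions
```

Because every severity combination appears exactly once, each dimension's effect can be estimated in isolation, and pairwise combinations expose cross-dimension interactions.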

The research emerges from growing awareness that LLMs perform unpredictably when users deviate from cooperative interaction patterns. Healthcare contexts amplify this risk—patients may inadvertently fabricate symptoms due to cognitive biases, withhold information due to embarrassment, or express symptoms inconsistently across consultations. Prior benchmarks failed to measure severity gradations or analyze how multiple adversarial dimensions compound, leaving blind spots in deployment readiness.

The findings carry significant implications for AI-assisted medical systems. The 1.7-3.4x differential impact between information pollution (fabricated symptoms) and information deficit (withheld symptoms) suggests that data quality matters far more than data quantity in medical LLM applications. The super-additive interaction pattern when fabrication combines with other adversarial behaviors indicates that robustness improvements targeting single dimensions may prove insufficient—models may need fundamentally different architectures to resist coordinated patient behaviors. The observation that exhaustive questioning cannot compensate for false information reveals an inherent limitation of inquiry-based mitigation strategies.
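The super-additivity claim reduces to a simple check on accuracy drops: a dimension pair interacts super-additively when the combined drop exceeds the sum of the individual drops. The numbers below are placeholders for illustration, not figures from the paper:

```python
def interaction_effect(drop_a: float, drop_b: float, drop_ab: float) -> float:
    """Combined accuracy drop minus the sum of individual drops.
    A positive result means the pair is super-additive."""
    return drop_ab - (drop_a + drop_b)

# Placeholder accuracy drops in percentage points.
fabrication, withholding, combined = 20.0, 8.0, 35.0
effect = interaction_effect(fabrication, withholding, combined)
print(effect)       # 7.0
print(effect > 0)   # True: super-additive pair
```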

These results should inform clinical deployment standards, suggesting that LLMs require additional safeguards—such as multi-turn verification protocols or confidence calibration mechanisms—before integration into diagnostic workflows. Future research should explore whether fine-tuning on adversarial dialogues improves robustness or merely shifts vulnerability patterns.

Key Takeaways
  • Fabricated symptoms cause 1.7-3.4x larger diagnostic accuracy drops than withheld information across LLMs
  • Super-additive interaction effects occur only when fabrication combines with other adversarial behaviors, not in non-fabricating dimension pairs
  • Exhaustive questioning recovers withheld information but cannot compensate for false patient inputs
  • Individual LLM vulnerability profiles vary significantly, with worst-case performance degradation spanning 15.3 percentage points between most and least robust models
  • MedDialBench's controlled factorial design enables dose-response profiling and cross-dimension interaction detection that existing benchmarks could not support
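A minimal sketch of the dose-response profiling the last takeaway refers to: measuring the accuracy drop at each graded severity level relative to the cooperative baseline. The severity levels and accuracy values here are invented placeholders:

```python
def dose_response(acc_by_level: dict[int, float]) -> dict[int, float]:
    """Accuracy drop (percentage points) at each severity level,
    relative to the level-0 cooperative baseline."""
    baseline = acc_by_level[0]
    return {lvl: baseline - acc for lvl, acc in acc_by_level.items()}

# Placeholder: diagnostic accuracy (%) at graded fabrication severities.
profile = dose_response({0: 82.0, 1: 71.5, 2: 55.0})
print(profile)  # {0: 0.0, 1: 10.5, 2: 27.0}
```

A drop that grows with severity level, as in this toy profile, is exactly the graded effect the benchmark's design is built to expose.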