#medical-datasets News & Analysis

3 articles tagged with #medical-datasets. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 29 · 7/10

MedCase-Structured: A Text-to-FHIR Dataset for Benchmarking Diagnostic Reasoning in Clinically Realistic EHR Settings

Researchers introduced MedCase-Structured, a synthetic dataset that converts unstructured clinical text into the standardized HL7 FHIR format for evaluating large language models in realistic healthcare settings. The study finds that LLMs perform significantly worse on structured clinical data than on plain text, highlighting a critical gap between academic benchmarks and real-world deployment requirements.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

MedAction: Towards Active Multi-turn Clinical Diagnostic LLMs

Researchers introduced MedAction, a framework and dataset designed to improve how large language models perform clinical diagnosis by simulating real-world multi-turn diagnostic processes. The approach addresses limitations of current medical LLMs through a tree-structured distillation pipeline that generates high-quality diagnostic trajectories, and it achieves state-of-the-art performance among open-source models.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning

Researchers introduced PulseLM, a large-scale dataset that pairs PPG (photoplethysmography) cardiovascular sensor data with natural-language text for multimodal AI models. The dataset contains 1.31 million PPG segments and 3.15 million question-answer pairs, and is designed to enable language-based physiological reasoning in healthcare AI applications.