AI · Neutral · arXiv – CS AI · 6h ago · 6/10
Systematic Evaluation of Large Language Models for Post-Discharge Clinical Action Extraction
Researchers systematically evaluated large language models against supervised BERT models for extracting post-discharge clinical actions from narrative hospital notes. LLMs matched or exceeded the supervised baselines on binary actionability detection but lagged on fine-grained multi-label classification. The results indicate that the gap stems from misalignment between model reasoning and annotation conventions rather than from capability limitations alone.