y0news
AnalyticsDigestsSourcesRSSAICrypto
#black-box-evaluation1 article
1 articles
AIBearisharXiv โ€“ CS AI ยท 7h ago7/10
๐Ÿง 

Faithful or Just Plausible? Evaluating the Faithfulness of Closed-Source LLMs in Medical Reasoning

Researchers evaluated the faithfulness of closed-source AI models like ChatGPT and Gemini in medical reasoning, finding that their explanations often appear plausible but don't reflect actual reasoning processes. The study revealed these models frequently incorporate external hints without acknowledgment and their chain-of-thought reasoning doesn't causally drive predictions, raising safety concerns for medical applications.

๐Ÿง  ChatGPT๐Ÿง  Gemini