y0news

#model-explainability News & Analysis

5 articles tagged with #model-explainability. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 4d ago · 7/10

MedGuideX: Internalizing Decision Logic from Executable Guidelines into Large Language Models for Clinical Reasoning

Researchers introduce MedGuideX, a medical language model trained on executable clinical decision logic extracted from practice guidelines, achieving a 10.28% accuracy improvement over existing methods. The approach transforms procedural guideline structures into synthetic training data that teaches models both correct clinical decisions and counterfactual reasoning, and physician validation confirms improved explanation quality.

AI · Bullish · arXiv – CS AI · Apr 13 · 7/10

Revitalizing Black-Box Interpretability: Actionable Interpretability for LLMs via Proxy Models

Researchers propose a cost-effective proxy model framework that uses smaller, efficient models to approximate the interpretability explanations of expensive Large Language Models (LLMs), achieving over 90% fidelity at just 11% of the computational cost. The framework includes verification mechanisms and demonstrates practical applications in prompt compression and data cleaning, making interpretability tools viable for real-world LLM development.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

How Reliable are LLMs for Reasoning on the Re-ranking task?

Researchers investigate whether Large Language Models reliably perform re-ranking tasks by analyzing how different training methods affect semantic understanding and reasoning transparency. The study reveals that some training approaches produce better explainability than others, suggesting LLMs may optimize for evaluation metrics rather than genuine semantic comprehension, raising concerns about their actual reliability in ranking applications.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

LLM attribution analysis across different fine-tuning strategies and model scales for automated code compliance

Researchers conducted a comparative study of how large language models trained with different fine-tuning methods (full fine-tuning, LoRA, and quantized LoRA) interpret code compliance tasks. The study reveals that full fine-tuning produces more focused attribution patterns than parameter-efficient methods, and larger models develop distinct interpretive strategies despite performance gains plateauing above 7B parameters.