#influence-functions News & Analysis

2 articles tagged with #influence-functions. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Neutral · arXiv – CS AI · Jun 9 · 7/10

Mechanistic Data Attribution: Tracing the Training Origins of Interpretable LLM Units

Researchers introduce Mechanistic Data Attribution (MDA), a framework that uses influence functions to trace interpretable units in large language models back to specific training samples. In experiments on Pythia models, they show that removing or augmenting high-influence training samples causally affects the emergence of interpretable circuits, and they report direct evidence linking induction heads to in-context learning capabilities.
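
For readers unfamiliar with influence functions, the sketch below illustrates the general idea on a toy problem. It is not the MDA method: it assumes a one-parameter ridge regression, where the Hessian is a single number, and scores each training sample by the classic estimate `-g_test * H^-1 * g_train`. All names and data (`fit`, `influence_on_test_loss`, `removal_effect`, `XS`, `YS`) are hypothetical and chosen for illustration only.

```python
def fit(xs, ys, weights, lam):
    """Closed-form minimizer of sum_i weights[i]*0.5*(w*x_i - y_i)**2 + 0.5*lam*w**2."""
    sxx = sum(c * x * x for c, x in zip(weights, xs))
    sxy = sum(c * x * y for c, x, y in zip(weights, xs, ys))
    return sxy / (sxx + lam)


def loss(w, x, y):
    """Squared-error loss of the scalar model w*x on one sample."""
    return 0.5 * (w * x - y) ** 2


def grad(w, x, y):
    """Derivative of loss(w, x, y) with respect to w."""
    return (w * x - y) * x


def influence_on_test_loss(xs, ys, lam, x_test, y_test):
    """Influence of upweighting each training sample on the test loss."""
    n = len(xs)
    weights = [1.0 / n] * n
    w = fit(xs, ys, weights, lam)
    # Hessian of the regularized training objective (a scalar in this toy case).
    hessian = sum(c * x * x for c, x in zip(weights, xs)) + lam
    g_test = grad(w, x_test, y_test)
    return [-g_test * grad(w, x, y) / hessian for x, y in zip(xs, ys)]


def removal_effect(xs, ys, lam, x_test, y_test, i):
    """Actual change in test loss after refitting with sample i's weight set to zero."""
    n = len(xs)
    full = [1.0 / n] * n
    dropped = [0.0 if j == i else 1.0 / n for j in range(n)]
    before = loss(fit(xs, ys, full, lam), x_test, y_test)
    after = loss(fit(xs, ys, dropped, lam), x_test, y_test)
    return after - before


# Hypothetical toy data: the last training sample is an outlier.
XS = [1.0, 2.0, 3.0, 4.0]
YS = [1.1, 1.9, 3.2, 8.0]
LAM = 0.1
X_TEST, Y_TEST = 2.5, 2.5

scores = influence_on_test_loss(XS, YS, LAM, X_TEST, Y_TEST)
# Removing a sample is an upweighting of -1/n, so the predicted change is -score/n.
predicted = [-s / len(XS) for s in scores]
actual = [removal_effect(XS, YS, LAM, X_TEST, Y_TEST, i) for i in range(len(XS))]

for i, (p, a) in enumerate(zip(predicted, actual)):
    print(f"sample {i}: predicted change {p:+.4f}, actual change {a:+.4f}")
```

Comparing `predicted` with `actual` mirrors, in miniature, the removal experiments the summary describes: the estimate is first-order, so it should track the sign and rough size of the retraining effect rather than match it exactly.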

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Towards Dys-XAI: Influence-Based Explanations for Dysarthria Severity Assessment

Researchers propose Dys-XAI, an influence-based explainability framework that makes deep learning predictions for dysarthria severity assessment interpretable by linking each decision to similar training examples. The method uses gradient-based influence approximations to identify supportive and competing samples, and validation experiments confirm that removing influential samples systematically alters predictions. The work targets a critical gap between model performance and clinical adoption.
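
The summary does not say which gradient-based approximation Dys-XAI uses. As one common possibility, the sketch below assumes a first-order score, the dot product between the loss gradient of a training sample and that of a test sample, computed on a tiny logistic regression. A positive score marks a supportive sample (a gradient step on it would lower the test loss) and a negative score marks a competing one. The model, the data, and the names `train`, `gradient_influence`, `TRAIN_DATA`, and `TEST_POINT` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_grad(w, x, y):
    """Gradient of the binary cross-entropy loss of logistic regression w.r.t. w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]


def train(data, steps=200, lr=0.5):
    """Plain full-batch gradient descent on the mean cross-entropy loss."""
    w = [0.0] * len(data[0][0])
    for _ in range(steps):
        g = [0.0] * len(w)
        for x, y in data:
            for j, gj in enumerate(loss_grad(w, x, y)):
                g[j] += gj / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def gradient_influence(w, data, test_point):
    """Dot product of each training gradient with the test gradient.

    Positive: supportive sample. Negative: competing sample.
    """
    x_test, y_test = test_point
    g_test = loss_grad(w, x_test, y_test)
    return [
        sum(a * b for a, b in zip(loss_grad(w, x, y), g_test))
        for x, y in data
    ]


# Hypothetical data: each x is [bias, feature]; the last sample sits in the
# positive region but carries label 0, so it should compete with the test point.
TRAIN_DATA = [
    ([1.0, 2.0], 1),
    ([1.0, 1.5], 1),
    ([1.0, -2.0], 0),
    ([1.0, -1.5], 0),
    ([1.0, 1.0], 0),
]
TEST_POINT = ([1.0, 1.2], 1)

weights = train(TRAIN_DATA)
influence = gradient_influence(weights, TRAIN_DATA, TEST_POINT)
supportive = [i for i, s in enumerate(influence) if s > 0]
competing = [i for i, s in enumerate(influence) if s < 0]
print("supportive:", supportive, "competing:", competing)
```

A removal check of the kind the summary mentions would retrain without the top-ranked samples and compare the prediction for `TEST_POINT` before and after.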