y0news

#training-data-attribution News & Analysis

1 article tagged with #training-data-attribution. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 18h ago · 7/10

Mechanistic Data Attribution: Tracing the Training Origins of Interpretable LLM Units

Researchers introduce Mechanistic Data Attribution (MDA), a framework that uses influence functions to trace interpretable units in large language models back to specific training samples. In experiments on Pythia models, they show that removing or augmenting high-influence training samples causally affects the emergence of interpretable circuits. The results also provide direct evidence linking induction heads to in-context learning.
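
For readers unfamiliar with influence functions, the sketch below illustrates the general idea on a toy problem. It is not the paper's implementation: a small ridge regression stands in for the language model, the first parameter `theta[0]` stands in for a measurable property of an interpretable unit, and all function names (`fit_ridge`, `influence_scores`, `predicted_removal_change`) are hypothetical. The score for each training sample is the classical first-order estimate, the negative of the target gradient times the inverse Hessian times that sample's loss gradient.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-2):
    """Fit ridge regression; return the parameters and the objective's Hessian."""
    n, d = X.shape
    # Objective: (1/n) * sum_i 0.5 * (x_i . theta - y_i)^2 + 0.5 * lam * ||theta||^2
    H = X.T @ X / n + lam * np.eye(d)
    theta = np.linalg.solve(H, X.T @ y / n)
    return theta, H

def influence_scores(X, y, theta, H, target_grad):
    """Influence of upweighting each training sample on a scalar target f(theta).

    score_i = -grad f(theta)^T  H^{-1}  grad loss_i(theta)
    """
    residuals = X @ theta - y
    per_sample_grads = residuals[:, None] * X  # shape (n, d)
    return -per_sample_grads @ np.linalg.solve(H, target_grad)

def predicted_removal_change(scores, i):
    """First-order estimate of the change in f when sample i is removed.

    Removal corresponds to upweighting sample i by -1/n.
    """
    return -scores[i] / len(scores)

# Toy data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

theta, H = fit_ridge(X, y)
target_grad = np.array([1.0, 0.0, 0.0])  # f(theta) = theta[0]
scores = influence_scores(X, y, theta, H, target_grad)

# The training sample whose removal is predicted to move the target the most.
top = int(np.argmax(np.abs(scores)))
print("most influential sample:", top)
print("predicted change in theta[0] if removed:", predicted_removal_change(scores, top))
```

Refitting without the top-ranked sample and comparing the actual change in `theta[0]` against the prediction is the toy analogue of the removal experiments the summary describes. At LLM scale the Hessian cannot be formed or inverted directly, so practical methods rely on approximations.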