#attribution-methods News & Analysis

8 articles tagged with #attribution-methods. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 1 · 7/10

An Odd Estimator for Shapley Values

Researchers prove that Shapley values, a key framework for attribution in machine learning, depend exclusively on the odd component of set functions. The result explains why paired sampling is effective and enables OddSHAP, a new estimator that performs regression solely on the odd subspace using a Fourier basis decomposition and reportedly achieves state-of-the-art accuracy.
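The claim about the odd component can be checked on a toy game by brute force. The sketch below is not OddSHAP. It assumes that "odd component" means the part of a set function that is antisymmetric under complementation, (v(S) − v(N∖S))/2, and it uses a hypothetical four-player game (`toy_game`) invented purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n):
    """Exact Shapley values for players 0..n-1 by enumerating all coalitions."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def odd_part(value, n):
    """Assumed definition: antisymmetric part under S -> complement of S."""
    full = frozenset(range(n))
    return lambda s: 0.5 * (value(s) - value(full - s))


def even_part(value, n):
    """Assumed definition: symmetric part under S -> complement of S."""
    full = frozenset(range(n))
    return lambda s: 0.5 * (value(s) + value(full - s))


def toy_game(s):
    """Hypothetical non-additive game, used only for this check."""
    total = sum(j + 1 for j in s)
    bonus = 2.0 if {0, 2} <= s else 0.0
    return total ** 2 / 10 + bonus


N = 4
phi_full = shapley_values(toy_game, N)
phi_odd = shapley_values(odd_part(toy_game, N), N)
phi_even = shapley_values(even_part(toy_game, N), N)
```

Under these assumptions, `phi_odd` matches `phi_full` and every entry of `phi_even` is zero up to floating-point error: each coalition's marginal contribution to the even part cancels against that of the complementary coalition, which carries the same Shapley weight.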

AI · Neutral · arXiv – CS AI · May 12 · 7/10

Causal Dimensionality of Transformer Representations: Measurement, Scaling, and Layer Structure

Researchers introduce causal dimensionality (kappa), a measurable property quantifying how transformer layers causally influence model outputs, finding that representational capacity grows 15.6x faster than causal capacity across scaling conditions. The metric remains invariant to model size increases, suggesting causal influence is a fundamental architectural property independent of parameter count.

AI · Neutral · arXiv – CS AI · Mar 9 · 7/10

From Features to Actions: Explainability in Traditional and Agentic AI Systems

Researchers demonstrate that traditional explainable AI methods designed for static predictions fail when applied to agentic AI systems that make sequential decisions over time. The study shows attribution-based explanations work well for static tasks but trace-based diagnostics are needed to understand failures in multi-step AI agent behaviors.

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 5

Using the Path of Least Resistance to Explain Deep Networks

Researchers propose Geodesic Integrated Gradients (GIG), a new method for explaining AI model decisions that uses curved paths instead of straight lines to compute feature importance. The method addresses flawed attributions in existing approaches by integrating gradients along geodesic paths under a model-induced Riemannian metric.
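For context, the straight-line baseline that GIG departs from is standard Integrated Gradients. The sketch below illustrates only that baseline, not the geodesic method. The model (`toy_model`), its hand-written gradient (`toy_grad`), and the all-zeros baseline are hypothetical choices made for illustration.

```python
import numpy as np


def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Straight-line Integrated Gradients using a midpoint Riemann sum."""
    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    # Points on the straight line from the baseline to the input.
    path = baseline + alphas[:, None] * (x - baseline)
    avg_grad = np.mean([grad_fn(p) for p in path], axis=0)
    return (x - baseline) * avg_grad


def toy_model(x):
    """Hypothetical model: f(x) = x0^2 + 3*x1."""
    return x[0] ** 2 + 3.0 * x[1]


def toy_grad(x):
    """Analytic gradient of toy_model."""
    return np.array([2.0 * x[0], 3.0])


x = np.array([2.0, -1.0])
baseline = np.zeros(2)
attributions = integrated_gradients(toy_grad, x, baseline)
```

For this toy model the attributions sum to `toy_model(x) - toy_model(baseline)`, the completeness property of Integrated Gradients. Per the summary above, a geodesic variant replaces the straight-line `path` with a curved one derived from a model-induced metric.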

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Who Gets Credit or Blame? Attributing Accountability in Modern AI Systems

Researchers propose a framework to attribute AI model behavior to specific development stages (pretraining, fine-tuning, alignment), enabling accountability tracking without model retraining. The method quantifies how each stage contributes to model outputs and can identify spurious correlations, advancing transparency in AI development.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

Faithfulness Evaluation for Decoder-only LLM Attributions with Controlled Retained Information

Researchers propose π-Soft-NC and π-Soft-NS, improved evaluation metrics for assessing input attribution methods in large language models that control for the number of retained words, addressing a fundamental bias in existing faithfulness evaluation frameworks. They also introduce Grad-ELLM, a gradient-based attribution method designed for decoder-only LLMs that combines gradient and attention mechanisms for stronger explanatory performance.

🧠 Llama

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Attribution-based Explanations for Markov Decision Processes

Researchers have developed attribution techniques that explain decision-making in Markov Decision Processes (MDPs), extending explainability methods beyond static inputs to sequential decision-making systems. The approach assigns importance scores to states and execution paths, enabling more interpretable AI agents in dynamic environments.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Frequency-Aware Model Parameter Explorer: A new attribution method for improving explainability

Researchers introduce FAMPE, a novel attribution method that uses frequency-domain analysis to improve explainability in deep neural networks. By separately perturbing high and low-frequency components through FFT-based techniques, the method outperforms existing attribution approaches on ImageNet across multiple architectures without requiring manual baseline selection.
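The FFT-based separation of low- and high-frequency content mentioned above can be sketched generically. This is not FAMPE's implementation: the function name `split_frequencies`, the circular mask, and the `cutoff` radius are assumptions made for illustration, and the perturbation step itself is omitted.

```python
import numpy as np


def split_frequencies(image, cutoff):
    """Split a 2-D array into low- and high-frequency parts with a circular FFT mask."""
    # Shift the spectrum so the zero-frequency (DC) term sits at the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h // 2, xx - w // 2)
    low_mask = radius <= cutoff
    # Invert each masked spectrum; the two masks partition the frequency plane.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


rng = np.random.default_rng(0)
image = rng.random((8, 8))  # stand-in for a single-channel image
low, high = split_frequencies(image, cutoff=2)
```

Because the two masks partition the spectrum, `low + high` reconstructs the input up to floating-point error, so each band can be perturbed separately and recombined.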