TN-SHAP-G: Graph-Structured Tensor Network Surrogates for Shapley Values and Interactions
Researchers introduce TN-SHAP-G, a machine learning framework that efficiently computes Shapley values, a widely used attribution measure for explaining AI model decisions, by exploiting graph structure in the input data. The approach uses tensor networks to build compact surrogates that scale to larger graphs, where traditional methods become computationally infeasible.
TN-SHAP-G addresses a fundamental computational challenge in explainable AI: computing Shapley values exactly requires evaluating the model on exponentially many input subsets (2^n for n features), which is impractical for large-scale applications. By exploiting the graph structure inherent in many real-world datasets, the framework significantly reduces this computational burden while matching exact Shapley values on graphs small enough for exact computation.
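To make the exponential cost concrete, here is a minimal sketch of exact Shapley computation by subset enumeration. It is not the TN-SHAP-G method; the function names and the three-player `toy_value` game are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value, n):
    """Exact Shapley values for a set function `value` over players 0..n-1.

    Enumerates every subset of the other players for each player, so the
    number of model evaluations grows exponentially with n.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(s):
    """Hypothetical game: additive terms plus one interaction between 0 and 1."""
    total = 0.0
    if 0 in s:
        total += 1.0
    if 1 in s:
        total += 2.0
    if 2 in s:
        total += 3.0
    if 0 in s and 1 in s:
        total += 4.0  # interaction term, split evenly by the Shapley value
    return total


phi = exact_shapley(toy_value, 3)
```

In this toy game the interaction term is shared equally between players 0 and 1, and the attributions sum to the value of the full coalition. With n features the inner loops visit 2^(n-1) subsets per feature, which is the bottleneck a compact surrogate is meant to avoid.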
Shapley values have become increasingly important as regulators and stakeholders demand transparency in machine learning models, particularly in high-stakes domains like finance, healthcare, and molecular design. Traditional sampling-based approaches introduce Monte Carlo variance and require numerous model queries, creating bottlenecks for practitioners. The tensor network approach mirrors input graph topology, enabling the surrogate to capture meaningful relationships while remaining compact and trainable from limited oracle queries.
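For comparison, the sampling-based baseline mentioned above can be sketched as a permutation-sampling estimator. This is a generic illustration of where Monte Carlo variance comes from, not code from the paper; `sampled_shapley` and the `toy_value` game are hypothetical names.

```python
import random


def toy_value(s):
    """Hypothetical game: additive terms plus one interaction between 0 and 1."""
    total = 0.0
    if 0 in s:
        total += 1.0
    if 1 in s:
        total += 2.0
    if 2 in s:
        total += 3.0
    if 0 in s and 1 in s:
        total += 4.0
    return total


def sampled_shapley(value, n, num_permutations, seed):
    """Estimate Shapley values by averaging marginal gains over random orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n
    players = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(players)
        coalition = frozenset()
        prev = value(coalition)
        for i in players:
            coalition = coalition | {i}
            cur = value(coalition)
            phi[i] += cur - prev  # marginal contribution of i in this ordering
            prev = cur
    return [p / num_permutations for p in phi]


# Each run costs num_permutations * (n + 1) model queries, and the estimates
# for interacting players depend on which orderings happened to be drawn.
estimate_a = sampled_shapley(toy_value, 3, num_permutations=5, seed=0)
estimate_b = sampled_shapley(toy_value, 3, num_permutations=5, seed=1)
```

Player 2 is purely additive in this toy game, so its estimate is the same in every ordering. The estimates for players 0 and 1 fluctuate with the random seed and only converge as the number of sampled permutations, and therefore the number of model queries, grows.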
The framework is immediately relevant to molecular modeling and drug discovery, where graph-structured data dominates and the interpretation of model predictions directly affects research outcomes. Industries that rely on black-box models for critical decisions, from credit scoring to autonomous systems, could benefit from faster, more reliable attribution methods. Recovering Shapley indices deterministically, with no added sampling variance, is a meaningful advance over sampling-based alternatives.
Looking forward, the generalization of this approach beyond graph-structured inputs and integration with real-time model deployment pipelines will determine broader adoption. The work also opens questions about computational trade-offs between surrogate training time and downstream interpretation speed, particularly as models and datasets continue scaling.
- TN-SHAP-G uses tensor networks to efficiently compute Shapley values for graph-structured data without Monte Carlo variance
- The framework scales to larger graphs where sampling-based attribution methods become computationally infeasible
- Learned surrogates require only a small number of initial model queries, reducing overall computational cost significantly
- Experimental validation on molecular benchmarks shows accuracy matching exact Shapley values on small graphs
- The approach enables deterministic recovery of first- and higher-order interaction indices for model interpretation
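
As an illustration of what a higher-order interaction index measures, the sketch below computes the classical pairwise Shapley interaction index by brute-force enumeration. Whether TN-SHAP-G targets this exact index is an assumption here; the function names and the `toy_value` game are hypothetical.

```python
from itertools import combinations
from math import factorial


def toy_value(s):
    """Hypothetical game: additive terms plus one interaction between 0 and 1."""
    total = 0.0
    if 0 in s:
        total += 1.0
    if 1 in s:
        total += 2.0
    if 2 in s:
        total += 3.0
    if 0 in s and 1 in s:
        total += 4.0
    return total


def pairwise_interaction(value, n, i, j):
    """Pairwise Shapley interaction index of players i and j (brute force).

    Averages the discrete second difference
    v(S + i + j) - v(S + i) - v(S + j) + v(S)
    over subsets S of the remaining players, with weight
    |S|! (n - |S| - 2)! / (n - 1)!.
    """
    others = [k for k in range(n) if k not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
        for subset in combinations(others, size):
            s = frozenset(subset)
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total


interaction_01 = pairwise_interaction(toy_value, 3, 0, 1)
interaction_02 = pairwise_interaction(toy_value, 3, 0, 2)
```

In this toy game the index recovers the interaction term between players 0 and 1 and is zero for the purely additive pair (0, 2). Like the first-order case, brute-force enumeration is exponential in the number of players.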