🧠 AI | ⚪ Neutral | Importance 6/10

Assessing Per-Sample Membership Inference Vulnerability without Retraining

arXiv – CS AI | Valentin Dorseuil (DI-ENS), Jamal Atif (CMAP), Olivier Cappé (DI-ENS) | May 27, 2026 at 04:00 AM

🤖AI Summary

Researchers propose a method to assess how vulnerable each individual training sample is to membership inference attacks, without training shadow models. The approach combines theoretical analysis in linear settings with a practical surrogate score for deep networks, using only geometry and loss information from a single trained model.

Analysis

This research targets a practical gap in privacy assessment: measuring membership inference risk for individual training samples is expensive. In a membership inference attack, an adversary tries to determine whether a specific data point was used to train a model. Standard evaluation trains shadow models that replicate the target model's training process, which becomes computationally prohibitive at scale. The authors show that per-sample vulnerability is driven by two factors: a point's training loss and its geometric properties within the data distribution, quantified through leverage scores.
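To make the leverage-score idea concrete, here is a minimal sketch using the classical statistical definition for a linear model, where the leverage of sample i is the i-th diagonal entry of the hat matrix H = X (X^T X)^+ X^T. The function name `leverage_scores` and the synthetic data are illustrative assumptions, and the paper's exact definition may differ.

```python
import numpy as np

def leverage_scores(X: np.ndarray) -> np.ndarray:
    """Diagonal of the hat matrix H = X (X^T X)^+ X^T, one value per row of X."""
    # Thin SVD: X = U diag(s) V^T. H is the projector onto the column space of X,
    # so H = U_r U_r^T, where U_r keeps the columns with nonzero singular values.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    # Drop directions with numerically zero singular values (rank-deficient X).
    tol = s.max() * max(X.shape) * np.finfo(float).eps
    U = U[:, s > tol]
    return np.sum(U**2, axis=1)

# Illustrative data (an assumption, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[0] = 10.0                    # place one sample far from the bulk of the data
h = leverage_scores(X)
print(h.sum())                 # leverages sum to rank(X), here 5 up to rounding
print(int(np.argmax(h)))       # index of the highest-leverage sample
```

Each leverage lies between 0 and 1, and a value near 1 means the fitted model is pulled strongly toward that single point, which is the intuition behind treating geometrically isolated samples as more exposed.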

The theoretical contribution comes from an analysis of linear models, where the authors derive closed-form expressions that decompose vulnerability to black-box attacks into interpretable components. This foundation extends to modern deep networks by operating on last-layer representations, where the network's output is linear in the features and the linear analysis approximately applies. Their surrogate score requires only a single forward pass through an already-trained model, which sharply reduces computational overhead compared with shadow-model approaches.
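For a deep network, the loss and geometry ingredients can both be read off a single trained model. The sketch below assumes the last-layer features and logits have already been extracted into NumPy arrays; the function names and the ridge term `lam` are hypothetical choices made for numerical stability. It deliberately stops short of combining the two quantities, since this summary does not state the paper's surrogate formula.

```python
import numpy as np

def per_sample_cross_entropy(logits: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Cross-entropy loss of each sample, from raw logits and integer labels."""
    # Numerically stable log-softmax: subtract the row maximum before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def ridge_leverage(features: np.ndarray, lam: float = 1e-3) -> np.ndarray:
    """Ridge leverage h_i = phi_i^T (Phi^T Phi + lam I)^{-1} phi_i for each row phi_i."""
    d = features.shape[1]
    gram = features.T @ features + lam * np.eye(d)   # (d, d), invertible for lam > 0
    Z = np.linalg.solve(gram, features.T)            # (d, n), avoids an explicit inverse
    return np.einsum("ij,ji->i", features, Z)        # h_i = phi_i . Z[:, i]

# Illustrative stand-ins for extracted features, logits, and labels (assumptions).
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 16))
logits = rng.normal(size=(100, 10))
labels = rng.integers(0, 10, size=100)

loss = per_sample_cross_entropy(logits, labels)
leverage = ridge_leverage(features)
print(loss.shape, leverage.shape)
```

Working with the d-by-d Gram matrix rather than the n-by-n hat matrix keeps the cost manageable when the number of training samples is much larger than the feature dimension.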

For the broader privacy community, this work offers a scalable diagnostic tool for responsible model deployment. Organizations training large language models and other high-stakes systems could assess privacy risks during development rather than after the fact. The method's efficiency could also make privacy auditing affordable for resource-constrained teams.

The research also bridges theoretical privacy analysis with practical implementation, validating the surrogate score across diverse datasets and architectures. This empirical grounding strengthens confidence in the approach's generalizability. Looking forward, researchers should explore whether similar geometric insights apply to other privacy attacks and whether the framework extends to federated learning scenarios where privacy concerns intensify.

Key Takeaways

→ A new method assesses per-sample membership inference vulnerability without training shadow models, significantly reducing computational costs.
→ Vulnerability stems from both training loss and data-dependent geometric measures quantifiable through leverage scores.
→ The approach works on deep networks by analyzing last-layer representations, requiring only a single trained model.
→ Empirical evaluation shows the surrogate score outperforms simpler baselines like loss and gradient-norm at identifying high-risk training points.
→ The framework enables scalable privacy auditing during model development, making privacy assessment economically feasible for organizations.
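
For readers unfamiliar with the gradient-norm baseline mentioned in the takeaways, here is a minimal sketch restricted to the last layer. It assumes a softmax classifier with a bias-free linear last layer, for which the gradient of the cross-entropy loss with respect to the weights is the outer product of (p - onehot(y)) and the feature vector, so its Frobenius norm factorizes. The function name is hypothetical, and the paper may compute the baseline over all model parameters instead.

```python
import numpy as np

def last_layer_grad_norm(features: np.ndarray, logits: np.ndarray,
                         labels: np.ndarray) -> np.ndarray:
    """Per-sample Frobenius norm of d(loss)/d(W) for a bias-free softmax last layer."""
    # Softmax probabilities, computed stably.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # Residual p - onehot(y): subtract 1 at the true-class position.
    residual = probs.copy()
    residual[np.arange(len(labels)), labels] -= 1.0
    # ||outer(residual, phi)||_F = ||residual||_2 * ||phi||_2
    return np.linalg.norm(residual, axis=1) * np.linalg.norm(features, axis=1)

# Illustrative data (assumptions): 6 samples, 3 features, 4 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 3))
W = rng.normal(size=(4, 3))
labels = np.array([0, 1, 2, 3, 0, 1])
scores = last_layer_grad_norm(features, features @ W.T, labels)
print(scores.shape)
```

Like the plain loss, this baseline looks at one sample at a time and does not account for where that sample sits relative to the rest of the training data.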