CausShield: Sample Reconstruction-Resilient Vertical FL via Causal Representation Learning
CausShield is a new defense mechanism for vertical federated learning that uses causal representation learning to protect against sample reconstruction attacks while maintaining model performance. The approach decomposes shared representations into task-relevant and task-irrelevant components, achieving better privacy-utility tradeoffs than existing defenses through unsupervised learning rather than supervised training.
CausShield addresses a critical vulnerability in vertical federated learning systems where multiple parties collaborate using vertically partitioned data. While VFL enables distributed learning without sharing raw samples, it remains susceptible to active reconstruction attacks that can compromise privacy. The research applies structural causal models to distinguish between features necessary for task performance and those that primarily encode private information, providing both theoretical justification and practical defense mechanisms.
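To make the decomposition idea concrete, here is a minimal sketch of a passive party that encodes its local feature columns and shares only one block of the resulting embedding. It is not the paper's algorithm: the function names `split_representation` and `passive_party_forward`, the fixed slice index `k`, and the toy `tanh` encoder are all assumptions made for illustration. In the actual method the separation would be learned rather than fixed.

```python
import numpy as np

def split_representation(z, k):
    """Split each row of z into a (task-relevant, task-irrelevant) pair.

    Assumption: the first k dimensions are treated as task-relevant.
    A fixed slice stands in for a learned separation.
    """
    if not 0 < k < z.shape[1]:
        raise ValueError("k must be strictly between 0 and the embedding width")
    return z[:, :k], z[:, k:]

def passive_party_forward(x_local, weights, k):
    """Hypothetical passive-party step: encode local features, split the result."""
    z = np.tanh(x_local @ weights)            # full local embedding
    z_task, z_private = split_representation(z, k)
    return z_task, z_private                  # only z_task would leave the party

rng = np.random.default_rng(0)
x_a = rng.normal(size=(4, 6))   # party A holds 6 feature columns for 4 samples
w_a = rng.normal(size=(6, 5))   # toy encoder producing a 5-dimensional embedding
shared, withheld = passive_party_forward(x_a, w_a, k=3)
print(shared.shape, withheld.shape)  # (4, 3) (4, 2)
```

The point of the sketch is only the data flow: the raw columns and the withheld block stay local, and the coordinating party sees the task-relevant block alone.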
The significance of this work stems from the fundamental tension in privacy-preserving machine learning: defenses that suppress information indiscriminately harm model utility, while those relying on end-to-end supervised training expose vulnerabilities during early training epochs. CausShield navigates this tradeoff through unsupervised representation learning, which avoids the supervised training bottleneck that creates early-epoch exposure windows. The researchers establish theoretical convergence guarantees, proving that privacy-preserving decomposition doesn't compromise standard VFL convergence properties.
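The tension described above can be illustrated with a toy linear experiment. This is a sketch under strong assumptions, not a reproduction of the paper's evaluation: the synthetic data, the unit-variance Gaussian noise defense, and the use of least-squares error as a stand-in for both utility and reconstruction difficulty are all hypothetical choices. The representation is built from a task-relevant part `s`, which alone determines the label, and a task-irrelevant part `p`, which stands in for private information.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
s = rng.normal(size=(n, 2))        # task-relevant components
p = rng.normal(size=(n, 2))        # task-irrelevant components (private stand-in)
y = s @ np.array([1.5, -2.0])      # the label depends on s only
z = np.hstack([s, p])              # full representation

def fit_mse(features, target):
    """In-sample mean squared error of a least-squares fit of target on features."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return float(np.mean((features @ coef - target) ** 2))

# Indiscriminate defense: add the same noise to every dimension.
noisy = z + rng.normal(scale=1.0, size=z.shape)
# Targeted defense: keep s untouched and withhold p entirely.
targeted = z[:, :2]

utility_noisy = fit_mse(noisy, y)            # lower is better for the task
utility_targeted = fit_mse(targeted, y)
leak_noisy = fit_mse(noisy, p[:, 0])         # higher means p is harder to recover
leak_targeted = fit_mse(targeted, p[:, 0])

print(f"task error:           noisy={utility_noisy:.3f}  targeted={utility_targeted:.3e}")
print(f"reconstruction error: noisy={leak_noisy:.3f}  targeted={leak_targeted:.3f}")
```

In this toy setting, the uniform noise raises the task error while still leaving `p` partly recoverable, whereas withholding only `p` keeps the task fit exact and leaves a linear attacker with no signal about `p`. Real representations are not this cleanly separable, which is why the separation has to be learned.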
For the federated learning and privacy-tech communities, the work is a step toward practical deployment of privacy-preserving systems. The experimental evaluation compares CausShield with state-of-the-art defenses and tests it against recent attacks from NDSS'25 and USENIX Security'25, reporting advantages in privacy protection, model utility, and computational efficiency. This matters to organizations implementing federated learning in industries such as healthcare and finance, and in collaborative AI development.
The work suggests that federated learning systems can achieve stronger privacy guarantees without significant performance penalties, potentially accelerating adoption in sensitive domains. Future work will likely focus on scaling these mechanisms to larger feature sets and exploring adaptive defenses that respond to evolving attack strategies.
- CausShield uses causal representation learning to separate task-relevant from task-irrelevant features in federated learning, enabling targeted privacy protection without suppressing utility-critical information
- Unsupervised decomposition approach eliminates early-epoch vulnerability windows inherent in supervised training-based defenses
- Theoretical analysis proves the mechanism preserves standard VFL convergence properties while providing privacy guarantees
- Experimental results show consistent improvements over seven state-of-the-art defenses, including under recent advanced reconstruction attacks
- The approach has practical implications for deploying privacy-preserving federated learning in sensitive industries like healthcare and finance