AINeutralarXiv – CS AI · 7h ago6/10
🧠
Fair Finetuning Mitigates Distribution Inference Attacks
Researchers introduce Fair Fine-tuning (FFt), a defense mechanism that combines fairness constraints with model fine-tuning to mitigate distribution inference attacks, where adversaries infer sensitive demographic information from machine learning models. The approach reduces adversarial accuracy gaps from ~15% to under 4% across multiple datasets while providing formal theoretical guarantees linking fairness metrics to privacy protection.
🏢 Meta