AI · Bearish · arXiv – CS AI · 10h ago · 7/10
The Unseen Hand: Manipulating Model Fairness and SHAP with Targeted Identity Re-Association Attacks
Researchers have identified a new class of attacks, Targeted Identity Re-Association (TIRA), that can manipulate machine learning fairness audits and SHAP explainability tools without leaving detectable traces. The attacks use probabilistic output manipulation techniques to mask the influence of protected features, demonstrating that critical AI accountability mechanisms can be gamed by a sophisticated adversary.