The Pokémon Theorem and Other Fairness Impossibility Results
Researchers demonstrate that multiple fairness impossibility results in machine learning share a common geometric structure rooted in RKHS theory, proving that fairness criteria become mathematically incompatible when base rates differ across groups. The work introduces the 'Pokémon theorem' showing any finite collection of linear fairness constraints leaves residual violations, with implications for fair AI systems in high-stakes applications.
This theoretical computer science paper addresses a fundamental problem in fair machine learning: the mathematical incompatibility of fairness objectives. The authors prove that several seemingly disparate impossibility results—including the Kleinberg-Mullainathan-Raghavan dichotomy—share identical geometric roots in reproducing kernel Hilbert space (RKHS) theory. When demographic groups have unequal base rates, fairness constraints become overdetermined, making simultaneous satisfaction impossible.

The research extends beyond classical impossibility results by establishing the 'Pokémon theorem,' which demonstrates that for any finite set of linear mean-fairness criteria, at least one group pair will exhibit residual violations measurable by maximum mean discrepancy (MMD). These violations decay predictably at the Kolmogorov m-width rate under spectral regularity conditions. The work also proves that fair representation learning faces similar constraints: attempting to achieve both parity and class-conditional separation simultaneously forces class collapse when base rates differ. The practical contribution lies in their approximation framework, which establishes signal-error frontiers enabling practitioners to explicitly trade fairness compliance against real-world estimator performance. Experimental validation on standard fairness benchmarks confirms the theoretical predictions.

This research provides essential grounding for practitioners and policymakers deploying AI systems in sensitive domains like hiring, lending, and criminal justice. Rather than pursuing elusive perfect fairness, stakeholders can now quantify unavoidable trade-offs and make informed design decisions aligned with their priorities and constraints.
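To make the MMD-measured residual violations concrete, here is a minimal sketch of an empirical squared-MMD estimate between the predicted scores of two demographic groups. This is illustrative only and not the authors' construction: the RBF kernel, the `gamma` bandwidth, and the synthetic group score distributions are all assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Pairwise RBF (Gaussian) kernel matrix between 1-D samples a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-gamma * d ** 2)

def mmd2(x, y, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
scores_a = rng.normal(0.4, 0.1, 500)  # hypothetical group A scores
scores_b = rng.normal(0.6, 0.1, 500)  # hypothetical group B scores

print(mmd2(scores_a, scores_b))  # strictly positive: distributions differ
print(mmd2(scores_a, scores_a))  # ~0: identical samples
```

In this framing, a fairness criterion that constrains group means in the RKHS can be satisfied while the MMD between group score distributions stays bounded away from zero, which is the kind of residual violation the theorem quantifies.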
- Fairness impossibility results share common RKHS geometry where unequal base rates overdetermine linear constraints on conditional mean embeddings
- The Pokémon theorem proves any finite collection of linear fairness criteria leaves measurable residual violations in at least one group pair
- Fair representation learning that forces both parity and class-conditional separation simultaneously causes class collapse under demographic imbalance
- Theoretical bounds enable explicit trade-off quantification between fairness compliance and estimator performance in real-world applications
- Results validated on standard fairness benchmarks provide mathematical grounding for fair AI system design decisions
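The base-rate tension behind the Kleinberg-Mullainathan-Raghavan dichotomy can be checked with elementary confusion-matrix arithmetic: if two groups share the same true-positive and false-positive rates (equalized odds), their positive predictive values must differ whenever their base rates differ. The numbers below are made up for illustration.

```python
def ppv(base_rate, tpr, fpr):
    """Positive predictive value from prevalence and error rates (Bayes' rule)."""
    tp = base_rate * tpr          # mass of true positives
    fp = (1 - base_rate) * fpr    # mass of false positives
    return tp / (tp + fp)

# Equalized odds: identical TPR/FPR across both groups.
tpr, fpr = 0.8, 0.2

print(ppv(0.3, tpr, fpr))  # group with 30% base rate -> PPV ≈ 0.632
print(ppv(0.5, tpr, fpr))  # group with 50% base rate -> PPV = 0.8
```

Since PPV is pinned down by the base rate once TPR and FPR are fixed, equalized odds and equal predictive value (a calibration-style criterion) cannot both hold unless the base rates coincide, which is the overdetermination the summary describes.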