A Fiber Criterion for Representation Identifiability in Supervised Learning
A new theoretical framework formalizes when properties of a learned representation can be uniquely identified from input-output behavior alone. The research shows that representation-level claims require assumptions beyond predictive performance: auxiliary information can be added to a representation while preserving the predictor's outputs, so supervised learning determines less about the representation than is commonly assumed.
This paper addresses a foundational problem in machine learning theory: the gap between what supervised learning can actually identify and what researchers assume it identifies. When a model decomposes into a representation function and a classifier head, supervised evidence constrains only their composition, not the individual components. The authors formalize this with a fiber criterion: a representation property is identifiable only when it remains constant across all representation-head pairs that produce identical predictions.
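The criterion described above can be written compactly. The notation below (the symbols $\phi$, $h$, $f$, $\mathcal{F}$, and $P$) is assumed here for illustration and may differ from the paper's own; it is a sketch of the idea, not the authors' exact statement.

```latex
% Assumed notation: \phi is the representation, h the head, f the composite predictor.
\[
  f = h \circ \phi, \qquad \phi : \mathcal{X} \to \mathcal{Z}, \quad h : \mathcal{Z} \to \mathcal{Y}.
\]
% Fiber over f: every representation-head pair that yields the same predictor.
\[
  \mathcal{F}(f) = \{\, (\phi', h') : h' \circ \phi' = f \,\}.
\]
% A representation property P is identifiable from f when it is constant on the fiber.
\[
  P \text{ is identifiable from } f
  \iff
  P(\phi_1) = P(\phi_2) \quad \text{for all } (\phi_1, h_1), (\phi_2, h_2) \in \mathcal{F}(f).
\]
```

Under this reading, supervised evidence can at best pin down $f$, so any property that varies within $\mathcal{F}(f)$ needs an extra assumption before it can be claimed. The latent space $\mathcal{Z}$ is allowed to differ between pairs in the fiber.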
The work emerges from a growing recognition that representation learning relies on implicit assumptions that are often invisible to practitioners. As deep learning increasingly dominates applications from vision to language models, understanding which properties are actually determined by data becomes critical. The paper's predictor-preserving augmentation construction provides concrete examples: auxiliary information can be embedded in a representation without affecting predictions, yet it can fundamentally alter properties such as minimality, compression, invariance, and semantic accessibility.
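A toy version of a predictor-preserving augmentation can make this concrete. The sketch below is not the paper's construction or code: the functions `phi`, `head`, `auxiliary`, `augment`, `restrict`, and `ignores_second_feature`, and the four example inputs, are all hypothetical and chosen only to show that two representation-head pairs can agree on every prediction while disagreeing on an invariance property.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Representation = Callable[[Sequence[float]], Vector]
Head = Callable[[Sequence[float]], int]


def phi(x: Sequence[float]) -> Vector:
    """Hypothetical base representation: keeps only the first input feature."""
    return [x[0]]


def head(z: Sequence[float]) -> int:
    """Hypothetical classifier head: thresholds the first coordinate."""
    return 1 if z[0] > 0.0 else 0


def auxiliary(x: Sequence[float]) -> Vector:
    """Extra information (the second input feature) that the task does not need."""
    return [x[1]]


def augment(base: Representation, aux: Representation) -> Representation:
    """Append auxiliary coordinates to the base representation."""
    def phi_aug(x: Sequence[float]) -> Vector:
        return list(base(x)) + list(aux(x))
    return phi_aug


def restrict(base_head: Head, base_dim: int) -> Head:
    """Build a head that reads only the first base_dim coordinates."""
    def head_aug(z: Sequence[float]) -> int:
        return base_head(z[:base_dim])
    return head_aug


def ignores_second_feature(rep: Representation, xs: Sequence[Sequence[float]]) -> bool:
    """True if rep gives equal outputs whenever two inputs agree on x[0]."""
    seen: Dict[float, Tuple[float, ...]] = {}
    for x in xs:
        z = tuple(rep(x))
        if seen.setdefault(x[0], z) != z:
            return False
    return True


inputs = [[1.0, 5.0], [1.0, -3.0], [-2.0, 5.0], [-2.0, -3.0]]
phi_aug = augment(phi, auxiliary)
head_aug = restrict(head, base_dim=1)

preds_base = [head(phi(x)) for x in inputs]
preds_aug = [head_aug(phi_aug(x)) for x in inputs]

print("same predictions:", preds_base == preds_aug)
print("base representation ignores feature 2:", ignores_second_feature(phi, inputs))
print("augmented representation ignores feature 2:", ignores_second_feature(phi_aug, inputs))
```

In this toy setting both pairs define the same predictor, so no amount of input-output evidence separates them, yet only the base representation is invariant to the second feature. Invariance is therefore not constant across pairs with identical predictions, which is the sense in which such a property is not identifiable from supervised evidence alone.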
For the machine learning community, the implication is that claims about learned representations, whether they concern fairness, interpretability, or robustness, require explicit justification beyond supervised performance metrics. Current practice often treats representations as uniquely determined when they are merely consistent with observations. The Waterbirds experiments demonstrate that different architectural constraints can select different representations that achieve similar accuracy, which shows that supervised learning alone leaves the choice of representation underdetermined.
This work strengthens the theoretical foundations necessary for responsible AI development. As representations increasingly inform high-stakes decisions, understanding their identifiability becomes essential for transparency and trustworthiness. Future research must specify additional criteria—architectural assumptions, regularization, or measurement procedures—to make representation-level claims scientifically defensible.
- Supervised learning constrains composite predictors but does not uniquely determine representation-head factorizations, creating fundamental identifiability gaps.
- Auxiliary information can be added to representations while preserving predictor outputs, making common representation properties non-identifiable from supervised evidence alone.
- Representation-level claims about compression, invariance, or fairness require explicit assumptions beyond predictive performance to be scientifically justified.
- Finite-sample diagnostics alone cannot establish which representation properties are truly determined by data versus artifacts of optimization choices.
- Additional inductive biases, architectural constraints, or measurement protocols are necessary to make representation identifiability claims rigorous and defensible.