The Loss Is Not Enough: Sampling Conditions and Inductive Bias in Contrastive Representation Learning
Researchers develop a theoretical framework proving that contrastive learning, a dominant self-supervised AI technique, recovers meaningful latent geometry only under specific sampling diversity conditions. They demonstrate that standard approaches can learn non-orthogonal representations, and they propose a corrected InfoNCE variant. Experiments show that architectural inductive bias becomes critical when sampling diversity is limited.
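
For readers unfamiliar with the objective being corrected, the following is a minimal sketch of the standard InfoNCE loss with in-batch negatives. It is not the corrected variant proposed in the work, whose form is not given here. The function name `info_nce`, the `temperature` default, and the use of cosine similarity are assumptions chosen for illustration.

```python
import numpy as np


def info_nce(anchors, positives, temperature=0.1):
    """Standard InfoNCE with in-batch negatives (illustrative sketch only).

    anchors, positives: arrays of shape (N, d); row i of each forms a
    positive pair, and every other row of `positives` acts as a negative.
    """
    # L2-normalise rows so dot products are cosine similarities (an assumption).
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # (N, N) similarity matrix; the diagonal holds the positive pairs.
    logits = a @ p.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Negative mean log-probability assigned to the true positive of each anchor.
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    eye = np.eye(4)
    # Correctly paired embeddings give a loss near zero.
    print(info_nce(eye, eye))
    # Mismatched pairs give a much larger loss.
    print(info_nce(eye, np.roll(eye, 1, axis=0)))
```

The loss depends only on which pairs are drawn into the batch, which is why the sampling distribution matters for what geometry the encoder can recover.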