AI · Bullish · arXiv – CS AI · 18h ago · 6/10
Correcting Mean Bias in Text Embeddings: A Refined Renormalization with Training-Free Improvements on MMTEB
Researchers identify a systematic mean bias in sentence-embedding models: all embeddings share a near-identical mean component. They propose two training-free corrections. The projection-based method (R2) improves results consistently across 38 models on MMTEB benchmarks because it cancels mean-estimation errors better than direct subtraction.
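The two corrections can be illustrated with a rough sketch. The summary gives no formulas, so everything here is an assumption: "direct subtraction" is taken to mean subtracting the estimated mean vector and renormalizing, and the projection-based method is taken to mean removing each embedding's component along the mean direction before renormalizing. The function names and the synthetic data are hypothetical and are not the paper's code.

```python
import math
import random

def normalize(v):
    """Scale v to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_vector(embeddings):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]

def subtract_mean(v, mu):
    """Assumed form of direct subtraction: v - mu, then renormalize."""
    return normalize([x - m for x, m in zip(v, mu)])

def project_out_mean(v, mu):
    """Assumed form of the projection method: drop v's component
    along the unit mean direction, then renormalize."""
    u = normalize(mu)
    coeff = dot(v, u)
    return normalize([x - coeff * y for x, y in zip(v, u)])

# Synthetic stand-in for model outputs: a shared offset plus small noise.
random.seed(0)
dim = 8
offset = [1.0] * dim  # hypothetical shared mean component
raw = [normalize([o + random.gauss(0, 0.1) for o in offset]) for _ in range(200)]

mu = mean_vector(raw)  # estimated from the sample, so it carries estimation error
subtracted = [subtract_mean(v, mu) for v in raw]
projected = [project_out_mean(v, mu) for v in raw]

# Raw vectors are nearly parallel; the corrected ones are spread out.
print("raw cosine:      ", dot(raw[0], raw[1]))
print("subtracted cosine:", dot(subtracted[0], subtracted[1]))
print("projected cosine: ", dot(projected[0], projected[1]))
```

In this sketch the projection depends only on the direction of the estimated mean, not its length, which is one plausible reason a projection could be less sensitive to mean-estimation error than subtraction. The summary does not say whether this is the mechanism the authors analyze.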