AI · Neutral · arXiv – CS AI · 14h ago · 6/10
Latent Terms: Dense Retrievers Contain Trivially Extractable BM25-ready Zipfian Vocabularies
Researchers show that sparse features amounting to a BM25-ready vocabulary can be extracted from dense neural retrievers without any specialized training. Sparse autoencoders decompose a frozen dense retriever into the components of classical sparse retrieval, and the resulting system performs on par with or better than single-vector methods while requiring no retrieval-specific supervision.
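
To make the idea concrete, here is a minimal sketch of how sparse-autoencoder latents could be treated as "terms" and scored with BM25. It is not the paper's implementation. The function names `topk_latents` and `bm25_scores`, the top-k ReLU encoder, and the toy weights are all assumptions made for illustration; in practice the encoder would be a sparse autoencoder trained on a frozen retriever's embeddings.

```python
import math
from collections import Counter


def topk_latents(embedding, encoder, k):
    """Map a dense embedding to the indices of its k strongest latents.

    `encoder` is a list of weight rows, one per latent. This stands in for a
    trained sparse autoencoder encoder (assumption: ReLU followed by top-k).
    """
    acts = [max(0.0, sum(w * x for w, x in zip(row, embedding))) for row in encoder]
    ranked = sorted(range(len(acts)), key=lambda i: acts[i], reverse=True)
    # Keep only latents that actually fired.
    return [i for i in ranked[:k] if acts[i] > 0.0]


def bm25_scores(query_terms, docs_terms, k1=1.2, b=0.75):
    """Score each document against the query with standard Okapi BM25.

    Terms can be any hashable items, e.g. latent indices from topk_latents.
    """
    n_docs = len(docs_terms)
    avg_len = (sum(len(d) for d in docs_terms) / n_docs) or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs_terms for t in set(d))
    scores = []
    for doc in docs_terms:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1.0 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf[t] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Toy 2-d "embeddings" and a hand-written 4-latent encoder (illustrative only).
    encoder = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
    docs = [[1.0, 0.1], [0.1, 1.0], [-1.0, 0.2]]
    doc_terms = [topk_latents(d, encoder, k=2) for d in docs]
    query_terms = topk_latents([0.9, 0.0], encoder, k=2)
    print(bm25_scores(query_terms, doc_terms))
```

Because the output of `topk_latents` is just a bag of discrete identifiers, it could be fed to any inverted-index engine in place of word tokens, which is what makes such features "BM25-ready" in the sense of the summary above.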