arXiv – CS AI
Clark Hash: Stateless Sparse Johnson-Lindenstrauss Quantization for Neural Embeddings
Clark Hash is a new compression codec that reduces neural embedding storage from 1,536 bytes to 48 bytes (32x compression) by combining a deterministic sparse Johnson-Lindenstrauss projection with scalar quantization. The method requires no training, learned codebooks, or corpus statistics, and it achieves a correlation of 0.91 or higher with dense cosine similarity scores on multilingual sentence-embedding benchmarks.
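To make the idea concrete, here is a minimal Python sketch of a stateless sparse projection followed by scalar quantization. It is not the Clark Hash codec itself: the summary does not give the codec's parameters, so everything here is an assumption chosen for illustration. The sketch assumes a 48-component output stored as one signed byte each, a count-sketch style projection with a single signed nonzero per input coordinate, BLAKE2b as the hash, and per-vector peak scaling that is discarded because cosine similarity ignores scale. The names `encode`, `code_cosine`, and `OUT_DIM` are hypothetical.

```python
import hashlib
import math
import struct

OUT_DIM = 48  # assumption: 48 projected components, one signed byte each


def _bucket_and_sign(index: int, seed: int = 0):
    """Map an input coordinate to (output bucket, +1/-1) using only a hash.

    No matrix, codebook, or corpus statistic is stored: the projection is
    fully determined by (seed, index), which is what makes it stateless.
    """
    digest = hashlib.blake2b(struct.pack("<QQ", seed, index), digest_size=8).digest()
    value = int.from_bytes(digest, "little")
    sign = 1 if value & 1 else -1
    bucket = (value >> 1) % OUT_DIM
    return bucket, sign


def encode(vec, seed: int = 0) -> bytes:
    """Project `vec` into OUT_DIM buckets, then quantize each to int8."""
    acc = [0.0] * OUT_DIM
    for i, x in enumerate(vec):
        bucket, sign = _bucket_and_sign(i, seed)
        acc[bucket] += sign * x  # sparse projection: one nonzero per input column
    peak = max(abs(a) for a in acc)
    if peak == 0.0:
        return bytes(OUT_DIM)  # all-zero input maps to an all-zero code
    # Scalar quantization to [-127, 127]; the scale `peak` is not stored.
    codes = [round(a / peak * 127) for a in acc]
    return struct.pack(f"{OUT_DIM}b", *codes)


def code_cosine(a: bytes, b: bytes) -> float:
    """Approximate cosine similarity computed directly on two codes."""
    xa = struct.unpack(f"{OUT_DIM}b", a)
    xb = struct.unpack(f"{OUT_DIM}b", b)
    dot = sum(p * q for p, q in zip(xa, xb))
    norm_a = math.sqrt(sum(p * p for p in xa))
    norm_b = math.sqrt(sum(q * q for q in xb))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With a 384-dimensional float32 input (1,536 bytes, one layout consistent with the figure quoted above, though the summary does not state the dimension), `encode` returns exactly 48 bytes, and `code_cosine(encode(u), encode(v))` gives a rough estimate of the dense cosine similarity between `u` and `v`. How closely such a simple sketch tracks the dense scores depends on the data; the 0.91+ figure belongs to the reported method, not to this illustration.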