AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Self-Conditioned Positional HNSW for Overlap-Aware Retrieval in Chunked-Document RAG Systems: Method and Industrial Evidence-Quality Audit
Researchers propose Self-Conditioned Positional HNSW (SCP-HNSW), a method that improves retrieval-augmented generation (RAG) systems by reducing how many redundant, overlapping chunks are retrieved from chunked documents. The approach adds positional codes to embeddings and uses a two-pass query procedure. It is assessed through 770 text-evidence reviews and 70 OCR audits, which show evidence quality varying across document types.
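The summary does not say how the positional codes are built or what each query pass does, so the following is only a rough sketch of the general idea, not the paper's algorithm. Everything in it is an assumption made for illustration: the unit-norm sine/cosine code, the `weight`, `fanout` and `min_gap` parameters, the toy embeddings, and the use of a brute-force NumPy search in place of a real HNSW index.

```python
import numpy as np

def positional_code(position, n_chunks):
    # Hypothetical unit-norm code for a chunk's relative position in its document.
    t = position / max(n_chunks - 1, 1)
    return np.array([np.sin(np.pi * t / 2), np.cos(np.pi * t / 2)])

def build_index(embeddings, weight=0.5):
    # Normalise the semantic part and append the weighted positional code.
    n = len(embeddings)
    sem = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    codes = np.stack([positional_code(i, n) for i in range(n)])
    return np.hstack([sem, weight * codes]), codes

def two_pass_query(query, index, codes, k=2, fanout=3, min_gap=0.4):
    # Pass 1 (assumed): query with a zero positional code. Because every code
    # has unit norm, this ranks chunks by semantic distance alone.
    q = query / np.linalg.norm(query)
    q1 = np.concatenate([q, np.zeros(codes.shape[1])])
    dists = np.linalg.norm(index - q1, axis=1)  # brute force stands in for HNSW
    candidates = np.argsort(dists)[: k * fanout]
    # Pass 2 (assumed): greedily keep candidates whose positional code is at
    # least min_gap away from every chunk already selected.
    selected = []
    for c in candidates:
        if all(np.linalg.norm(codes[c] - codes[s]) >= min_gap for s in selected):
            selected.append(int(c))
        if len(selected) == k:
            break
    return selected

# Toy document with six chunks; chunk 1 overlaps chunk 0 (made-up vectors).
embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.98, 0.2, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.8, 0.0, 0.6],
    [0.0, 0.7, 0.7],
])
index, codes = build_index(embeddings)
query = np.array([1.0, 0.05, 0.0])

print(two_pass_query(query, index, codes, min_gap=0.0))  # [0, 1]
print(two_pass_query(query, index, codes, min_gap=0.4))  # [0, 4]
```

With the positional filter disabled (`min_gap=0.0`), the two nearest chunks are the overlapping pair 0 and 1. With the filter on, the neighbouring chunk 1 is skipped in favour of chunk 4, the next most similar chunk that sits elsewhere in the document.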