AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 3
Scaling Retrieval Augmented Generation with RAG Fusion: Lessons from an Industry Deployment
Research on production RAG systems finds that retrieval fusion techniques such as multi-query retrieval and reciprocal rank fusion increase raw document recall but do not improve end-to-end performance, because of re-ranking limits and context constraints. In the study, fusion variants lowered accuracy from 0.51 to 0.48 while adding latency overhead.
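For readers unfamiliar with reciprocal rank fusion, here is a minimal sketch of the general technique, not the deployment described in the paper. It assumes each query variant returns a ranked list of document IDs without duplicates. The function name, the sample IDs, and the smoothing constant `k = 60` (a commonly used default) are illustrative assumptions; the summary does not say what the study used.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for three rewrites of the same user query.
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_d", "doc_a"],
    ["doc_b", "doc_c", "doc_e"],
]

fused = reciprocal_rank_fusion(rankings)

# Fusion surfaces five distinct documents instead of three, but a fixed
# downstream budget (here an assumed cutoff of 2) still limits what the
# generator actually sees.
top_k = fused[:2]
```

In this toy case the fused list is larger than any single ranking, yet only the first two entries survive the cutoff. That is one way higher raw recall can fail to reach the final answer.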