AI · Neutral · arXiv – CS AI · 8h ago · 6/10
QO-Bench: Diagnosing Query-Operator-Preserving Retrieval over Typed Event Tuples
Researchers introduce QO-Bench, a diagnostic benchmark for evaluating retrieval-augmented generation (RAG) systems on structured, database-style queries over text. The benchmark shows that current RAG systems are good at finding relevant passages but fail to preserve the typed values that query operators such as joins and counting depend on. This points to operator execution, rather than retrieval, as the core bottleneck.
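
To make the distinction between retrieval and operator execution concrete, here is a minimal Python sketch. It is not taken from QO-Bench: the `Event` schema, the sample data, and the helpers `retrieve`, `count_after`, and `join_target_to_actor` are all hypothetical, chosen only to illustrate why finding the right passages is not enough when a query needs typed values.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    # Hypothetical typed event tuple; field names are illustrative only.
    actor: str
    action: str
    target: str
    day: date


EVENTS = [
    Event("Acme", "acquired", "Bolt", date(2021, 3, 1)),
    Event("Acme", "acquired", "Cargo", date(2022, 7, 15)),
    Event("Bolt", "launched", "Widget", date(2021, 9, 30)),
]

# Free-text rendering of the same events, as a passage retriever would see them.
PASSAGES = [f"{e.actor} {e.action} {e.target} on {e.day:%B %d, %Y}." for e in EVENTS]


def retrieve(passages, keyword):
    """Toy keyword retriever: return passages that mention the keyword."""
    return [p for p in passages if keyword.lower() in p.lower()]


def count_after(events, actor, action, cutoff):
    """COUNT operator: needs `day` as a real date, not as a substring."""
    return sum(
        1 for e in events
        if e.actor == actor and e.action == action and e.day > cutoff
    )


def join_target_to_actor(events, left_action, right_action):
    """JOIN operator: match the left event's target to the right event's actor."""
    return [
        (left.actor, left.target, right.target)
        for left in events if left.action == left_action
        for right in events if right.action == right_action
        and right.actor == left.target
    ]


# Retrieval succeeds: both Acme passages are found.
hits = retrieve(PASSAGES, "Acme")

# The operators only work because the dates and entity names stayed typed.
recent = count_after(EVENTS, "Acme", "acquired", date(2022, 1, 1))
chain = join_target_to_actor(EVENTS, "acquired", "launched")

print(len(hits), recent, chain)
```

In this toy setup the retriever returns every relevant passage, yet the count and the join are computed from the typed tuples, not from the passage text. A pipeline that hands only the retrieved strings to the next stage would have to re-parse dates and entity names before it could apply either operator, which is the kind of failure the summary describes.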