y0news

#multi-hop-qa News & Analysis

3 articles tagged with #multi-hop-qa. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 5d ago · 6/10

A Fixed-Budget, Cluster-Aware Standard for LLM-as-a-Judge Evaluation: A Multi-Hop RAG Stress Test

Researchers propose a standardized measurement protocol for evaluating retrieval-augmented generation (RAG) systems using LLM judges, addressing inconsistencies in how semantic search quality is assessed. The standard fixes key variables like evidence budget and prompt while requiring cluster-aware statistical testing, revealing that previous comparisons may have overstated progress and that traditional BM25 retrieval outperforms pure semantic methods under controlled conditions.
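
To illustrate what "cluster-aware" testing can mean in practice, here is a minimal sketch that assumes a cluster bootstrap over paired judge-score differences. The summary does not say which test the paper uses, so the bootstrap, the function name `cluster_bootstrap_ci`, and the idea of grouping questions by a shared cluster label are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def cluster_bootstrap_ci(diffs, clusters, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the mean paired score difference.

    Hypothetical helper: whole clusters are resampled with replacement,
    so correlated items inside a cluster are never treated as independent.
    """
    rng = random.Random(seed)

    # Group the per-question differences by their cluster label.
    by_cluster = defaultdict(list)
    for diff, label in zip(diffs, clusters):
        by_cluster[label].append(diff)
    groups = list(by_cluster.values())

    means = []
    for _ in range(n_boot):
        # Draw as many clusters as there are, with replacement.
        sample = [rng.choice(groups) for _ in groups]
        flat = [d for group in sample for d in group]
        means.append(sum(flat) / len(flat))

    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

If the resulting interval contains zero, the apparent gap between two systems may not survive once within-cluster correlation is accounted for, which is one way a comparison can overstate progress.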

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Inference-Time Budget Control for LLM Search Agents

Researchers propose a two-stage inference-time budget control system for LLM search agents that optimizes how language models allocate computational resources between tool calls and token generation during multi-hop question answering. The method uses Value-of-Information scoring to decide when to retrieve information, decompose questions, or commit to final answers, demonstrating consistent performance gains across multiple benchmarks and model sizes.
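
The decision rule described above can be sketched roughly as follows. This is not the paper's scoring function: the name `choose_action`, the "estimated gain minus weighted cost" score, and the `trade_off` parameter are assumptions chosen to show the general shape of a Value-of-Information decision under a budget.

```python
def choose_action(est_gain, cost, budget_left, trade_off=1.0):
    """Pick the next agent step under a remaining budget.

    Hypothetical sketch: est_gain and cost map action names (for example
    "retrieve" or "decompose") to floats. Committing to a final answer
    ("answer") is the default and has a score of zero.
    """
    best_action, best_score = "answer", 0.0
    for action, gain in est_gain.items():
        # Skip actions the remaining budget cannot pay for.
        if cost[action] > budget_left:
            continue
        score = gain - trade_off * cost[action]
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

For example, `choose_action({"retrieve": 0.30, "decompose": 0.20}, {"retrieve": 0.10, "decompose": 0.05}, budget_left=1.0)` selects `"retrieve"`, while the same call with `budget_left=0.0` falls back to `"answer"`.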

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 12

Democratizing GraphRAG: Linear, CPU-Only Graph Retrieval for Multi-Hop QA

Researchers present SPRIG, a CPU-only GraphRAG system that eliminates expensive LLM-based graph construction and GPU requirements for multi-hop question answering. The system uses lightweight NER-driven co-occurrence graphs with Personalized PageRank, achieving comparable performance while reducing computational costs by 28%.
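
As a rough illustration of the technique named in this summary, the sketch below builds an entity co-occurrence graph and runs Personalized PageRank by power iteration in plain Python. It is not SPRIG's implementation: the function names, the damping value, and the assumption that entities have already been extracted per passage by some NER step are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(passage_entities):
    """Link every pair of entities that appear in the same passage."""
    graph = defaultdict(lambda: defaultdict(float))
    for entities in passage_entities:
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def personalized_pagerank(graph, seeds, damping=0.85, iters=50):
    """Power iteration that restarts at the query's seed entities."""
    nodes = list(graph)
    seeds = [s for s in seeds if s in graph]  # ignore seeds not in the graph
    if not seeds:
        return {n: 0.0 for n in nodes}
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            total = sum(graph[n].values())
            # Spread this node's rank over neighbours by edge weight.
            for m, weight in graph[n].items():
                nxt[m] += damping * rank[n] * weight / total
        rank = nxt
    return rank
```

Seeding the walk with the entities found in the question concentrates rank on entities reachable through shared passages, which is what lets a second-hop entity surface without any LLM call or GPU.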