#retrieval-augmented-generation News & Analysis

146 articles tagged with #retrieval-augmented-generation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 25 · 7/10

To Isolate or to Score? Model-Adaptive Assessment for Cost-Efficient Multi-Agent RAG

Researchers demonstrate that multi-agent document assessment for retrieval-augmented generation (RAG) systems can be significantly optimized through model-adaptive routing rather than expensive scoring mechanisms. The study reveals that weaker models benefit primarily from document isolation rather than quality assessment, while MADARA, a proposed adaptive architecture, generalizes across different model families with zero-shot capability, reducing computational overhead.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

Retrieval-Augmented Anatomical Guidance for Text-to-CT Generation

Researchers propose a retrieval-augmented approach for generating CT scans from radiology reports that combines semantic control with anatomical consistency by retrieving structurally similar clinical cases and using their annotations as guidance. The method improves image fidelity and clinical consistency compared to text-only baselines while enabling spatial controllability without requiring ground-truth annotations at inference time.

AI · Neutral · arXiv – CS AI · Jun 23 · 7/10

When Confidence Takes the Wrong Path: Diagnosing Retrieval-State Lock-In in RAG

Researchers identify 'retrieval-state lock-in,' a failure mode in retrieval-augmented generation (RAG) systems where multiple sampled answers agree despite being wrong because they condition on the same defective retrieval state. The study proposes decomposing confidence scores into three components—answer surface, evidence, and retrieval state—achieving 91.9% precision by requiring all three to agree, though this certifies only 7.7% of answers as low-risk.
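
The "all three must agree" rule described in this summary can be illustrated with a small conjunctive gate. This is a minimal sketch under stated assumptions: the field names, the 0.8 threshold, and the sample scores are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confidence:
    # Hypothetical scores in [0, 1]; the paper's actual signals may differ.
    answer_surface: float   # agreement among sampled answers
    evidence: float         # support from the retrieved passages
    retrieval_state: float  # health of the retrieval state itself


def certify_low_risk(c: Confidence, threshold: float = 0.8) -> bool:
    """Certify only when every component clears the (assumed) threshold."""
    return min(c.answer_surface, c.evidence, c.retrieval_state) >= threshold


samples = [
    Confidence(0.95, 0.90, 0.85),  # all three agree
    Confidence(0.97, 0.92, 0.30),  # lock-in pattern: answers agree, retrieval is bad
    Confidence(0.60, 0.88, 0.91),  # sampled answers disagree
    Confidence(0.99, 0.40, 0.95),  # weak evidence
]

certified = [certify_low_risk(c) for c in samples]
coverage = sum(certified) / len(samples)  # fraction of answers certified
```

A conjunctive gate of this kind trades coverage for precision, which is consistent with the pattern the summary reports: high precision on a small certified fraction.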

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

Only Ask What You Don't Know: Grounded Delta Planning for Efficient Multi-step RAG

Researchers introduce GDP-RAG, a novel retrieval-augmented generation framework that improves multi-hop question answering by focusing computation only on information gaps rather than over-generating reasoning steps. The system achieves 60.63% accuracy on benchmark datasets while reducing computational costs by 22-68% compared to existing approaches.

AI · Bullish · arXiv – CS AI · Jun 19 · 7/10

Multi-Agent Transactive Memory

Researchers propose Multi-Agent Transactive Memory (MATM), a framework enabling decentralized LLM agents to share and retrieve trajectories—recorded problem-solving paths—from a shared repository. Experiments in interactive environments demonstrate that agents retrieving stored trajectories improve task performance and efficiency without requiring coordination or joint training.

AI · Bullish · arXiv – CS AI · Jun 11 · 7/10

NightFeats @ MMU-RAGent NeurIPS 2025: A Context-Optimized Multi-Agent RAG System for the Text-to-Text Track

NightFeats, a multi-agent retrieval-augmented generation system, won Best Dynamic Evaluation at NeurIPS 2025's MMU-RAGent competition by prioritizing architectural transparency and evidence grounding over benchmark optimization. The system outperformed proprietary models like Claude-SonnetV2 and Nova-Pro through a three-phase pipeline combining retrieval, curation, and composition with explicit intermediate representations.

🧠 Claude

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG

Researchers introduce EPIC, a novel approach to on-device Retrieval-Augmented Generation (RAG) that prioritizes user preferences as compact personal context while operating under strict memory constraints. The method achieves dramatic efficiency gains—reducing memory usage by 2,404x and latency by 32x—while improving preference-following accuracy by 18.79 percentage points across multiple benchmarks.

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

One Token per Multimodal Evidence: Latent Memory for Resource-Constrained QA

Researchers introduce Latent Memory, a novel memory paradigm that compresses multimodal evidence (text and images) into single high-dimensional tokens for retrieval-augmented generation systems. The approach achieves competitive QA performance while reducing token consumption by 3-10x, addressing critical efficiency constraints in resource-limited deployments.

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

RAG over Thinking Traces Can Improve Reasoning Tasks

Researchers demonstrate that retrieval-augmented generation (RAG) significantly improves reasoning-intensive tasks by retrieving intermediate thinking traces rather than standard documents. The T3 method transforms these traces into structured representations, achieving 56.3% relative performance gains on AIME mathematics benchmarks and consistent improvements across multiple AI models and benchmarks.

🧠 GPT-5 · 🧠 Gemini

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

Anything2Skill: Compiling External Knowledge into Reusable Skills for Agents

Researchers introduce Anything2Skill, a framework that converts external knowledge sources into reusable, executable skills for AI agents. By combining skill extraction with retrieval-augmented generation, the system achieves 98.85% success on command-line tasks and 94.10% on GitHub operations, significantly outperforming RAG-only approaches.

AI · Bearish · arXiv – CS AI · Jun 9 · 7/10

Beyond Probabilistic Similarity: Structural, Temporal, and Causal Limitations of Retrieval-Augmented Generation in the Legal Domain

A research paper identifies fundamental architectural flaws in Retrieval-Augmented Generation (RAG) systems for legal AI, showing that probabilistic similarity-based retrieval cannot adequately capture the hierarchical, temporal, and causal structure inherent in legal knowledge. The authors propose a deterministic-by-design framework addressing mereological blindness, diachronic blindness, and causal opacity to prevent persistent failures like fabricated citations and anachronistic legal content.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

ConflictRAG: Detecting and Resolving Knowledge Conflicts in Retrieval Augmented Generation

ConflictRAG introduces a novel framework for detecting and resolving contradictory information in Retrieval-Augmented Generation systems, achieving 88.7% conflict-detection accuracy while reducing API costs by 62%. The system combines cost-efficient embedding-based detection with selective LLM refinement and demonstrates 5.3-6.1% improvements in answer correctness across multiple benchmarks.
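
The pairing of a cheap detector with selective LLM refinement can be pictured as a generic two-stage cascade. The sketch below is an illustration only: the summary does not say how ConflictRAG scores or routes passages, so the function names, the 0.3/0.7 band, and the toy scores are all hypothetical stand-ins.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def detect_conflicts(
    pairs: Sequence[Pair],
    cheap_score: Callable[[str, str], float],
    expensive_check: Callable[[str, str], bool],
    low: float = 0.3,   # assumed: at or below this, treat as "no conflict"
    high: float = 0.7,  # assumed: at or above this, treat as "conflict"
) -> Tuple[List[bool], int]:
    """Decide cheaply when the score is clear; escalate only the ambiguous band."""
    decisions: List[bool] = []
    expensive_calls = 0
    for a, b in pairs:
        score = cheap_score(a, b)
        if score >= high:
            decisions.append(True)
        elif score <= low:
            decisions.append(False)
        else:
            expensive_calls += 1  # only these pairs incur the costly call
            decisions.append(expensive_check(a, b))
    return decisions, expensive_calls


# Toy stand-ins: a lookup table plays the cheap detector, a constant
# function plays the expensive (LLM-like) checker.
toy_scores = {("A", "B"): 0.9, ("A", "C"): 0.1, ("B", "C"): 0.5}
decisions, calls = detect_conflicts(
    list(toy_scores),
    cheap_score=lambda a, b: toy_scores[(a, b)],
    expensive_check=lambda a, b: True,
)
```

In this toy run only one of three pairs reaches the expensive checker, which is the general mechanism by which a cascade can lower API spend.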

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

HypRAG: Hyperbolic Dense Retrieval for Retrieval Augmented Generation

Researchers introduce HypRAG, a novel dense retrieval system for retrieval-augmented generation that operates in hyperbolic space rather than traditional Euclidean space. The approach achieves up to 29% performance gains over Euclidean baselines by better preserving the hierarchical structure of natural language, reducing hallucination risks in AI systems.
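
To make the Euclidean-versus-hyperbolic contrast concrete, the sketch below compares the two distances using the Poincaré ball model. This is an assumption for illustration: the summary does not say which model of hyperbolic space HypRAG uses, and the sample points are hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def poincare_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Geodesic distance in the Poincaré ball (points must have norm < 1)."""
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


root = (0.0, 0.0)  # near the origin: room for "general" items
near = (0.5, 0.0)
far = (0.9, 0.0)   # near the boundary: room for "specific" items

# Equal Euclidean steps cover more hyperbolic distance closer to the boundary.
inner_step = poincare_distance(root, near)
outer_step = poincare_distance(near, far)
```

Distances grow rapidly toward the boundary of the ball, which is why tree-like hierarchies tend to embed with less distortion in hyperbolic space than in Euclidean space of the same dimension.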

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

Language-Native Materials Processing Design by Lightly Structured Text Database and Reasoning Large Language Model

Researchers have developed an AI framework that transforms materials synthesis procedures from unstructured narrative text into actionable, computable knowledge using large language models and structured databases. The system successfully optimized boron nitride nanosheet synthesis in three iterations, demonstrating AI's potential to accelerate complex materials discovery beyond traditional trial-and-error approaches.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

PolarMem: A Training-Free Polarized Latent Graph Memory for Verifiable Vision-Language Models

Researchers introduce PolarMem, a training-free memory framework that enhances vision-language models by explicitly tracking what has been verified as absent or excluded, not just what is similar. The system uses a polarized graph structure with positive and negative memory relations to reduce logical contradictions and improve reasoning reliability across multiple multimodal benchmarks.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

Revisiting Parameter-Based Knowledge Editing in Large Language Models: Theoretical Limits and Empirical Evidence

A new study challenges the viability of parameter-based knowledge editing in large language models, revealing that localized weight modifications cause global interference and capability degradation. The research demonstrates theoretically and empirically that simple retrieval-based approaches consistently outperform all parameter-editing methods, suggesting the field needs to fundamentally reconsider its approach to updating LLM knowledge.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

MemGraphRAG: Memory-based Multi-Agent System for Graph Retrieval-Augmented Generation

Researchers introduce MemGraphRAG, a memory-based multi-agent system that improves graph-based retrieval-augmented generation by maintaining global context across document corpora. The framework addresses limitations in existing GraphRAG methods by resolving logical conflicts and maintaining structural consistency, demonstrating superior performance on multiple benchmarks.

AI · Bullish · arXiv – CS AI · Jun 1 · 7/10

DynaTree: Dynamic Agentic Retrieval Tree for Time-Sensitive News Retrieval

DynaTree is a two-stage framework for efficient news retrieval that combines offline agentic reasoning with lightweight online subtree selection, achieving significant improvements in real-world deployment. The system demonstrated a 59-73% survival rate versus 32-53% for fixed approaches in production A/B testing, highlighting the practical value of persistent semantic expansion for time-sensitive information retrieval.

AI · Neutral · arXiv – CS AI · Jun 1 · 7/10

Understanding the Fundamental Design Decisions of Retrieval-Augmented Generation Systems

A comprehensive research study reveals that Retrieval-Augmented Generation (RAG) systems require context-aware deployment strategies rather than universal approaches. The analysis across multiple LLMs and datasets shows that RAG effectiveness depends heavily on task type, with optimal retrieval volumes and knowledge integration methods varying significantly between question answering and code generation applications.

AI · Bullish · arXiv – CS AI · May 29 · 7/10

Less Is More: Elevating RAG via Performance-Driven Context Compression

Researchers introduce CORE-RAG, a novel framework that compresses context in Retrieval-Augmented Generation systems using performance-driven learning rather than predefined heuristics. The approach achieves a 97% compression ratio while improving accuracy by 3.3 points on exact match scores, addressing a critical bottleneck in LLM efficiency.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Grounded Cache Routing for Retrieval-Augmented Generation: When Is It Safe to Reuse an Answer?

GroundedCache proposes a safety-first framework for reusing cached answers in retrieval-augmented generation systems by validating four conditions before serving cached responses. The system achieves near-zero unsafe-served rates (0-1.5%) across benchmarks while maintaining minimal latency overhead, addressing critical vulnerabilities in current caching approaches that can serve incorrect answers.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

FD-RAG: Federated Dual-System Retrieval-Augmented Generation

FD-RAG introduces a federated framework for retrieval-augmented generation that enables decentralized LLM deployment across edge devices without centralizing sensitive data. The system achieves 7.8% accuracy improvements and 8.4x latency reductions by splitting lightweight memory access from expensive LLM reasoning, while aggregating anonymized knowledge across fragmented device networks.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Plan Before Search: Search Agents Need Plan

Researchers demonstrate that large language models trained as retrieval-augmented agents benefit from explicit planning—decomposing questions into ordered sub-questions before searching—rather than reactive document-driven responses. They introduce a self-bootstrapping training paradigm that enables smaller seed models to generate filtered trajectories activating this planning behavior across different model sizes without requiring distillation from larger external models.

AI · Bearish · arXiv – CS AI · May 27 · 7/10

Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs

Researchers discovered that retrieval-augmented language models exhibit a critical safety gap: they can detect contradictory information in accumulated evidence but fail to incorporate this awareness into their final recommendations. Testing across model families showed single-turn safety evaluations significantly overestimate real-world robustness in multi-turn scenarios where evidence accumulates.

AI · Bearish · arXiv – CS AI · May 27 · 7/10

The Attribution Blind Spot: Detecting When Language Models Rely on Memory Rather Than Retrieved Context

Researchers identify a critical vulnerability in retrieval-augmented generation systems where language models produce faithful-looking outputs from memory rather than retrieved context, making it impossible to verify source attribution through output analysis alone. They propose Computational Reality Monitoring (CRM), a technique that detects internal representational differences to identify when models rely on pretraining data versus external evidence.
