y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#retrieval-augmented-generation News & Analysis

98 articles tagged with #retrieval-augmented-generation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

98 articles
AIBullisharXiv – CS AI · 2d ago7/10
🧠

Less Is More: Elevating RAG via Performance-Driven Context Compression

Researchers introduce CORE-RAG, a novel framework that compresses context in Retrieval-Augmented Generation systems using performance-driven learning rather than predefined heuristics. The approach achieves a 97% compression ratio while improving accuracy by 3.3 points on exact match scores, addressing a critical bottleneck in LLM efficiency.

AIBullisharXiv – CS AI · 3d ago7/10
🧠

Grounded Cache Routing for Retrieval-Augmented Generation: When Is It Safe to Reuse an Answer?

GroundedCache proposes a safety-first framework for reusing cached answers in retrieval-augmented generation systems by validating four conditions before serving cached responses. The system achieves near-zero unsafe-served rates (0-1.5%) across benchmarks while maintaining minimal latency overhead, addressing critical vulnerabilities in current caching approaches that can serve incorrect answers.

AIBullisharXiv – CS AI · 3d ago7/10
🧠

Plan Before Search: Search Agents Need Plan

Researchers demonstrate that large language models trained as retrieval-augmented agents benefit from explicit planning—decomposing questions into ordered sub-questions before searching—rather than reactive document-driven responses. They introduce a self-bootstrapping training paradigm that enables smaller seed models to generate filtered trajectories activating this planning behavior across different model sizes without requiring distillation from larger external models.

AIBullisharXiv – CS AI · 3d ago7/10
🧠

FD-RAG: Federated Dual-System Retrieval-Augmented Generation

FD-RAG introduces a federated framework for retrieval-augmented generation that enables decentralized LLM deployment across edge devices without centralizing sensitive data. The system achieves 7.8% accuracy improvements and 8.4x latency reductions by splitting lightweight memory access from expensive LLM reasoning, while aggregating anonymized knowledge across fragmented device networks.

AIBearisharXiv – CS AI · 4d ago7/10
🧠

Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs

Researchers discovered that retrieval-augmented language models exhibit a critical safety gap: they can detect contradictory information in accumulated evidence but fail to incorporate this awareness into their final recommendations. Testing across model families showed single-turn safety evaluations significantly overestimate real-world robustness in multi-turn scenarios where evidence accumulates.

AIBearisharXiv – CS AI · 4d ago7/10
🧠

The Attribution Blind Spot: Detecting When Language Models Rely on Memory Rather Than Retrieved Context

Researchers identify a critical vulnerability in retrieval-augmented generation systems where language models produce faithful-looking outputs from memory rather than retrieved context, making it impossible to verify source attribution through output analysis alone. They propose Computational Reality Monitoring (CRM), a technique that detects internal representational differences to identify when models rely on pretraining data versus external evidence.

AIBullisharXiv – CS AI · May 127/10
🧠

VLADriver-RAG: Retrieval-Augmented Vision-Language-Action Models for Autonomous Driving

Researchers introduce VLADriver-RAG, a new framework that combines Vision-Language-Action models with retrieval-augmented generation for autonomous driving. By grounding decisions in explicit historical knowledge rather than relying solely on learned parameters, the system achieves state-of-the-art performance on the Bench2Drive benchmark with a Driving Score of 89.12, demonstrating improved generalization in complex driving scenarios.

AIBullisharXiv – CS AI · May 117/10
🧠

LARAG: Link-Aware Retrieval Strategy for RAG Systems in Hyperlinked Technical Documentation

LARAG introduces a link-aware retrieval strategy that improves RAG systems by leveraging hyperlink structures already present in technical documentation, rather than treating documents as flat text collections. The approach achieves better answer quality with fewer computational resources, demonstrating that implicit graph-like retrieval through existing metadata can enhance AI system performance.

AIBearisharXiv – CS AI · May 97/10
🧠

LeakDojo: Decoding the Leakage Threats of RAG Systems

LeakDojo is a new research framework that systematically evaluates security vulnerabilities in Retrieval-Augmented Generation (RAG) systems, revealing that stronger LLM instruction-following capabilities correlate with higher data leakage risks. The study benchmarks six attack methods across multiple LLMs and datasets, providing critical insights into how RAG databases can be exploited and suggesting that improvements in RAG faithfulness may paradoxically increase security vulnerabilities.

AIBullisharXiv – CS AI · May 47/10
🧠

Lightweight Domain Adaptation of a Large Language Model for Legal Assistance in the Indian Context

Researchers developed Legal Assist AI, a framework using an 8-billion-parameter Llama 3.1 model enhanced with Retrieval-Augmented Generation to provide legal assistance tailored to Indian law. The system achieved 60.08% on the All-India Bar Examination benchmark, outperforming OpenAI's 175-billion-parameter GPT-3.5 Turbo while being 22 times more parameter-efficient.

🧠 Llama
AINeutralarXiv – CS AI · May 47/10
🧠

LLM-Oriented Information Retrieval: A Denoising-First Perspective

Researchers propose that information retrieval for LLMs requires a fundamental shift toward denoising—prioritizing signal quality over quantity—because unlike humans, language models are vulnerable to hallucinations when processing noisy or irrelevant data within limited context windows. The paper introduces a four-stage framework addressing IR challenges from inaccessibility to unverifiability, with practical applications across RAG systems, coding agents, and multimodal understanding.

AIBearisharXiv – CS AI · May 17/10
🧠

Contextual Agentic Memory is a Memo, Not True Memory

Researchers argue that current AI agent memory systems (vector stores, RAG, scratchpads) perform lookup operations rather than true memory consolidation, causing agents to accumulate indefinite notes without developing expertise, hit a generalization ceiling on novel tasks, and remain vulnerable to persistent memory poisoning attacks. The paper draws on neuroscience's Complementary Learning Systems theory to show biological intelligence pairs fast exemplar storage with slow weight consolidation—a dual mechanism current AI systems lack.

AIBullisharXiv – CS AI · May 17/10
🧠

NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains

Researchers introduce NeocorRAG, a new framework that optimizes retrieval quality in Retrieval-Augmented Generation (RAG) systems by using Evidence Chains, achieving state-of-the-art performance while reducing token consumption by 80% compared to comparable methods. The framework addresses a critical gap where improvements in retrieval metrics don't consistently translate to better reasoning accuracy.

AIBullisharXiv – CS AI · Apr 147/10
🧠

AtlasKV: Augmenting LLMs with Billion-Scale Knowledge Graphs in 20GB VRAM

Researchers introduce AtlasKV, a parametric knowledge integration method that enables large language models to leverage billion-scale knowledge graphs while consuming less than 20GB of VRAM. Unlike traditional retrieval-augmented generation (RAG) approaches, AtlasKV integrates knowledge directly into LLM parameters without requiring external retrievers or extended context windows, reducing inference latency and computational overhead.

AIBullisharXiv – CS AI · Apr 147/10
🧠

Disco-RAG: Discourse-Aware Retrieval-Augmented Generation

Researchers introduce Disco-RAG, a discourse-aware framework that enhances Retrieval-Augmented Generation (RAG) systems by explicitly modeling discourse structures and rhetorical relationships between retrieved passages. The method achieves state-of-the-art results on question answering and summarization tasks without fine-tuning, demonstrating that structural understanding of text significantly improves LLM performance on knowledge-intensive tasks.

AIBullisharXiv – CS AI · Apr 147/10
🧠

Retrieval as Generation: A Unified Framework with Self-Triggered Information Planning

Researchers introduce GRIP, a unified framework that integrates retrieval decisions directly into language model generation through control tokens, eliminating the need for external retrieval controllers. The system enables models to autonomously decide when to retrieve information, reformulate queries, and terminate retrieval within a single autoregressive process, achieving competitive performance with GPT-4o while using substantially fewer parameters.

🧠 GPT-4
AINeutralarXiv – CS AI · Apr 67/10
🧠

Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation

Researchers published a comprehensive technical survey on Large Language Model augmentation strategies, examining methods from in-context learning to advanced Retrieval-Augmented Generation techniques. The study provides a unified framework for understanding how structured context at inference time can overcome LLMs' limitations of static knowledge and finite context windows.

AIBullisharXiv – CS AI · Mar 277/10
🧠

Training the Knowledge Base through Evidence Distillation and Write-Back Enrichment

Researchers introduce WriteBack-RAG, a framework that treats knowledge bases in retrieval-augmented generation systems as trainable components rather than static databases. The method distills relevant information from documents into compact knowledge units, improving RAG performance across multiple benchmarks by an average of +2.14%.

AIBearisharXiv – CS AI · Mar 277/10
🧠

Epistemic Bias Injection: Biasing LLMs via Selective Context Retrieval

Researchers have identified a new attack vector called Epistemic Bias Injection (EBI) that manipulates AI language models by injecting factually correct but biased content into retrieval-augmented generation databases. The attack steers model outputs toward specific viewpoints while evading traditional detection methods, though a new defense mechanism called BiasDef shows promise in mitigating these threats.

AIBullisharXiv – CS AI · Mar 177/10
🧠

$p^2$RAG: Privacy-Preserving RAG Service Supporting Arbitrary Top-$k$ Retrieval

Researchers propose p²RAG, a new privacy-preserving Retrieval-Augmented Generation system that supports arbitrary top-k retrieval while being 3-300x faster than existing solutions. The system uses an interactive bisection method instead of sorting and employs secret sharing across two servers to protect user prompts and database content.

$RAG
AIBullisharXiv – CS AI · Mar 177/10
🧠

APEX-Searcher: Augmenting LLMs' Search Capabilities through Agentic Planning and Execution

Researchers introduce APEX-Searcher, a new framework that enhances large language models' search capabilities through a two-stage approach combining reinforcement learning for strategic planning and supervised fine-tuning for execution. The system addresses limitations in multi-hop question answering by decoupling retrieval processes into planning and execution phases, showing significant improvements across multiple benchmarks.

AIBullisharXiv – CS AI · Mar 97/10
🧠

RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model

Researchers introduce RAG-Driver, a retrieval-augmented multi-modal large language model designed for autonomous driving that can provide explainable decisions and control predictions. The system addresses data scarcity and generalization challenges in AI-driven autonomous vehicles by using in-context learning and expert demonstration retrieval.

AINeutralarXiv – CS AI · Mar 97/10
🧠

Agentic retrieval-augmented reasoning reshapes collective reliability under model variability in radiology question answering

Researchers evaluated 34 large language models on radiology questions, finding that agentic retrieval-augmented reasoning systems improve consensus and reliability across different AI models. The study shows these systems reduce decision variability between models and increase robust correctness, though 72% of incorrect outputs still carried moderate to high clinical severity.

AIBullisharXiv – CS AI · Mar 56/10
🧠

From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems

Researchers demonstrate that coreference resolution significantly improves Retrieval-Augmented Generation (RAG) systems by reducing ambiguity in document retrieval and enhancing question-answering performance. The study finds that smaller language models benefit more from disambiguation processes, with mean pooling strategies showing superior context capturing after coreference resolution.

Page 1 of 4Next →