#rag News & Analysis

95 articles tagged with #rag. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 25 · 7/10

The Hitchhiker's Guide to Agentic AI: From Foundations to Systems

A comprehensive practitioner's reference guide on agentic AI systems has been announced, covering the complete stack from LLM foundations through production deployment. The work systematizes knowledge across transformer architecture, alignment techniques, retrieval systems, multi-agent coordination, and deployment frameworks—establishing agentic AI as a mature field requiring integrated understanding across all technical layers.

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

RAG over Thinking Traces Can Improve Reasoning Tasks

Researchers demonstrate that retrieval-augmented generation (RAG) significantly improves reasoning-intensive tasks by retrieving intermediate thinking traces rather than standard documents. The T3 method transforms these traces into structured representations, achieving 56.3% relative performance gains on AIME mathematics benchmarks and consistent improvements across multiple AI models and benchmarks.

🧠 GPT-5 · 🧠 Gemini

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

Reducing Hallucinations in Complex Question Answering using Simple Graph-based Retrieval-Augmented Generation (long version)

Researchers present a graph-based retrieval-augmented generation (RAG) system that reduces AI hallucinations by integrating lightweight graph structures with vector search tools. Testing on Wikipedia QA benchmarks shows the approach halves hallucinated answers while improving factual precision and recall with minimal token overhead.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG

FIDES is a training-free decoder that improves how language models handle conflicts between retrieved evidence and internal knowledge by applying selective, token-level corrections rather than uniform adjustments. The method achieves up to 92-94% context fidelity across multiple model scales, demonstrating that targeted intervention at critical decoding points outperforms existing contrastive decoding approaches.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

Bridging Requirements and Architecture: Multi-Agent Orchestration with External Knowledge and Hierarchical Memory

Researchers propose MAAD (Multi-Agent Architecture Design), a framework using orchestrated AI agents with external knowledge and hierarchical memory to automate software architecture design from requirements. The system outperforms existing approaches and demonstrates that advanced LLMs significantly improve architectural quality and validation efficiency.

🧠 GPT-5

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

MemGraphRAG: Memory-based Multi-Agent System for Graph Retrieval-Augmented Generation

Researchers introduce MemGraphRAG, a memory-based multi-agent system that improves graph-based retrieval-augmented generation by maintaining global context across document corpora. The framework addresses limitations in existing GraphRAG methods by resolving logical conflicts and maintaining structural consistency, demonstrating superior performance on multiple benchmarks.

AI · Bullish · arXiv – CS AI · May 29 · 7/10

OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources

OmniRetrieval is a new framework that enables unified retrieval across heterogeneous knowledge sources—including unstructured text, relational databases, knowledge graphs, and property graphs—by translating natural language queries into source-native queries rather than forcing all data into a homogenized format. The system demonstrates superior performance compared to single-source retrievers across 13 datasets and 309 knowledge bases, positioning it as a general-purpose interface that preserves the structural advantages of each knowledge source.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

RAG-Coding: Enhancing LLM Medical Coding with Structured External Knowledge

Researchers introduce RAG-Coding, an AI system using multiple LLM agents enhanced with retrieval-augmented generation to automate ICD-10-CM medical coding. The method outperforms baseline LLM approaches by 8-13% in accuracy and maintains clinical compliance by grounding decisions in official coding guidelines, while a newly released updated dataset enables evaluation against 2025 standards.

AI · Neutral · arXiv – CS AI · May 4 · 7/10

LLM-Oriented Information Retrieval: A Denoising-First Perspective

Researchers propose that information retrieval for LLMs requires a fundamental shift toward denoising—prioritizing signal quality over quantity—because unlike humans, language models are vulnerable to hallucinations when processing noisy or irrelevant data within limited context windows. The paper introduces a four-stage framework addressing IR challenges from inaccessibility to unverifiability, with practical applications across RAG systems, coding agents, and multimodal understanding.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Retrieval as Generation: A Unified Framework with Self-Triggered Information Planning

Researchers introduce GRIP, a unified framework that integrates retrieval decisions directly into language model generation through control tokens, eliminating the need for external retrieval controllers. The system enables models to autonomously decide when to retrieve information, reformulate queries, and terminate retrieval within a single autoregressive process, achieving competitive performance with GPT-4o while using substantially fewer parameters.

🧠 GPT-4

AI · Neutral · arXiv – CS AI · Apr 10 · 7/10

ATANT: An Evaluation Framework for AI Continuity

Researchers introduce ATANT, an open evaluation framework designed to measure whether AI systems can maintain coherent context and continuity across time without confusing information across different narratives. The framework achieves up to 100% accuracy in isolated scenarios but drops to 96% when managing 250 simultaneous narratives, revealing practical limitations in current AI memory architectures.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

PassiveQA: A Three-Action Framework for Epistemically Calibrated Question Answering via Supervised Finetuning

Researchers propose PassiveQA, a new AI framework that teaches language models to recognize when they don't have enough information to answer questions, choosing to ask for clarification or abstain rather than hallucinate responses. The three-action system (Answer, Ask, Abstain) uses supervised fine-tuning to align model behavior with information sufficiency, showing significant improvements in reducing hallucinations.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

Beyond Retrieval: Modeling Confidence Decay and Deterministic Agentic Platforms in Generative Engine Optimization

Researchers propose a new approach to Generative Engine Optimization (GEO) that moves beyond current RAG-based systems to deterministic multi-agent platforms. The study introduces mathematical models for confidence decay in LLMs and demonstrates near-zero hallucination rates through specialized agent routing in industrial applications.

AI · Neutral · arXiv – CS AI · Apr 6 · 7/10

Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation

Researchers published a comprehensive technical survey on Large Language Model augmentation strategies, examining methods from in-context learning to advanced Retrieval-Augmented Generation techniques. The study provides a unified framework for understanding how structured context at inference time can overcome LLMs' limitations of static knowledge and finite context windows.

AI · Bearish · arXiv – CS AI · Mar 27 · 7/10

Epistemic Bias Injection: Biasing LLMs via Selective Context Retrieval

Researchers have identified a new attack vector called Epistemic Bias Injection (EBI) that manipulates AI language models by injecting factually correct but biased content into retrieval-augmented generation databases. The attack steers model outputs toward specific viewpoints while evading traditional detection methods, though a new defense mechanism called BiasDef shows promise in mitigating these threats.

AI · Bullish · arXiv – CS AI · Mar 27 · 7/10

Training the Knowledge Base through Evidence Distillation and Write-Back Enrichment

Researchers introduce WriteBack-RAG, a framework that treats knowledge bases in retrieval-augmented generation systems as trainable components rather than static databases. The method distills relevant information from documents into compact knowledge units, improving RAG performance across multiple benchmarks by 2.14% on average.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

$p^2$RAG: Privacy-Preserving RAG Service Supporting Arbitrary Top-$k$ Retrieval

Researchers propose p²RAG, a new privacy-preserving Retrieval-Augmented Generation system that supports arbitrary top-k retrieval while being 3-300x faster than existing solutions. The system uses an interactive bisection method instead of sorting and employs secret sharing across two servers to protect user prompts and database content.

$RAG
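
The p²RAG summary above mentions selecting the top-k results by bisection instead of sorting. The sketch below illustrates only that general idea on plaintext scores: it bisects on a score threshold until exactly k items lie at or above it. It is not the paper's protocol (there is no secret sharing and no two-server interaction here), and the function name `top_k_by_bisection` and the `iters` parameter are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def top_k_by_bisection(scores: Sequence[float], k: int, iters: int = 60) -> List[int]:
    """Return indices of the k highest scores by bisecting on a threshold.

    Illustrative plaintext sketch only; not the p²RAG protocol.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")

    # Invariant: at least k scores are >= lo, and fewer than k are >= hi.
    lo = min(scores)
    hi = max(scores) + 1.0

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        count = sum(1 for s in scores if s >= mid)
        if count >= k:
            lo = mid
            if count == k:
                break  # threshold separates exactly k items
        else:
            hi = mid

    # Ties at the threshold can leave more than k candidates; truncate arbitrarily.
    candidates = [i for i, s in enumerate(scores) if s >= lo]
    return candidates[:k]


scores = [0.1, 0.9, 0.4, 0.7, 0.3]
top2 = top_k_by_bisection(scores, 2)
```

Each bisection step needs only a count of items above a threshold, which is a simpler primitive than a full sort; that is presumably why a counting-based search suits a privacy-preserving setting, though the details are in the paper itself.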

AI · Neutral · arXiv – CS AI · Mar 17 · 7/10

Agentic AI, Retrieval-Augmented Generation, and the Institutional Turn: Legal Architectures and Financial Governance in the Age of Distributional AGI

This research paper examines how agentic AI systems that can act autonomously challenge existing legal and financial regulatory frameworks. The authors argue that AI governance must shift from model-level alignment to institutional governance structures that create compliant behavior through mechanism design and runtime constraints.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

APEX-Searcher: Augmenting LLMs' Search Capabilities through Agentic Planning and Execution

Researchers introduce APEX-Searcher, a new framework that enhances large language models' search capabilities through a two-stage approach combining reinforcement learning for strategic planning and supervised fine-tuning for execution. The system addresses limitations in multi-hop question answering by decoupling retrieval processes into planning and execution phases, showing significant improvements across multiple benchmarks.

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

MMGraphRAG: Bridging Vision and Language with Interpretable Multimodal Knowledge Graphs

Researchers introduce MMGraphRAG, a new AI framework that addresses hallucination issues in large language models by integrating visual scene graphs with text knowledge graphs through cross-modal fusion. The system uses SpecLink for entity linking and demonstrates superior performance in multimodal information processing across multiple benchmarks.

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

RAG-X: Systematic Diagnosis of Retrieval-Augmented Generation for Medical Question Answering

Researchers propose RAG-X, a diagnostic framework for evaluating retrieval-augmented generation systems in medical AI applications. The study reveals an 'Accuracy Fallacy' showing a 14% gap between perceived system success and actual evidence-based grounding in medical question-answering systems.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

Researchers developed MA-RAG, a Multi-Round Agentic RAG framework that improves medical AI reasoning by iteratively refining responses through conflict detection and external evidence retrieval. The system achieved a 6.8-point accuracy improvement over baseline models across 7 medical Q&A benchmarks by addressing hallucinations and outdated knowledge in healthcare AI applications.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems

Researchers demonstrate that coreference resolution significantly improves Retrieval-Augmented Generation (RAG) systems by reducing ambiguity in document retrieval and enhancing question-answering performance. The study finds that smaller language models benefit more from disambiguation processes, with mean pooling strategies showing superior context capturing after coreference resolution.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

OSCAR: Online Soft Compression And Reranking

Researchers introduce OSCAR, a new query-dependent online soft compression method for Retrieval-Augmented Generation (RAG) systems that reduces computational overhead while maintaining performance. The method achieves 2-5x speed improvements in inference with minimal accuracy loss across LLMs from 1B to 24B parameters.

🏢 Hugging Face

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 2

Saarthi for AGI: Towards Domain-Specific General Intelligence for Formal Verification

Researchers have enhanced the Saarthi AI framework for formal verification, achieving 70% better accuracy in generating SystemVerilog assertions and 50% fewer iterations to reach coverage closure. The framework uses multi-agent collaboration and improved RAG techniques to move toward domain-specific AI intelligence for verification tasks.
