#retrieval-augmentation News & Analysis

11 articles tagged with #retrieval-augmentation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG

FIDES is a training-free decoder that improves how language models handle conflicts between retrieved evidence and internal knowledge by applying selective, token-level corrections rather than uniform adjustments. The method achieves up to 92-94% context fidelity across multiple model scales, demonstrating that targeted intervention at critical decoding points outperforms existing contrastive decoding approaches.

AI · Bearish · arXiv – CS AI · May 29 · 7/10

Persona Conditioning of Brand Recommendations in Retrieval-Augmented Commercial Chat: A Prominence-Stratified Cross-Provider Audit

A comprehensive audit of three major AI models reveals that personalized user contexts significantly reshape brand recommendations in commercial AI assistants, with mid-market brands experiencing up to 75% recommendation volatility while category leaders maintain 80% consistency across personas. The study demonstrates that AI recommendation bias is strongly correlated with model architecture and retrieval strategies, with implications for fair evaluation and brand perception measurement.

🏢 OpenAI · 🏢 Anthropic

AI · Neutral · arXiv – CS AI · Apr 15 · 7/10

Benchmarking Deflection and Hallucination in Large Vision-Language Models

Researchers introduce VLM-DeflectionBench, a new benchmark with 2,775 samples designed to evaluate how large vision-language models handle conflicting or insufficient evidence. The study reveals that most state-of-the-art LVLMs fail to appropriately deflect when faced with noisy or misleading information, highlighting critical gaps in model reliability for knowledge-intensive tasks.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Is GraphRAG Needed? From Basic RAG to Graph-/Agentic Solutions with Context Optimization

Researchers present a framework comparing Retrieval-Augmented Generation (RAG) variants, including GraphRAG, Modular RAG, and Agentic RAG, across 9 standardized scenarios. They introduce a context optimization method that reduces token usage by 19-53%, and they identify a retrieval-generation gap suggesting that advanced retrieval methods may not proportionally improve output quality.

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

SIGMA: Search-Augmented On-Demand Knowledge Integration for Agentic Mathematical Reasoning

Researchers introduce SIGMA, a multi-agent framework that enhances mathematical reasoning by orchestrating specialized agents to perform targeted searches and synthesize information through a moderator mechanism. The system achieves a 7.4% absolute performance improvement over existing models on challenging benchmarks like MATH500 and AIME, demonstrating that on-demand, context-sensitive knowledge integration significantly advances complex problem-solving capabilities.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

See More, Think Deeper: Query-Expanded Visual Evidence and Answer-Clue Guided Reflection for Long Video Understanding

Researchers introduce CoVER, a framework for Video Large Language Models that improves long-video understanding by expanding search queries to gather visual evidence and by using answer-specific visual feedback for verification. The approach outperforms similarly sized models and some closed-source alternatives.

AI · Neutral · arXiv – CS AI · Jun 3 · 6/10

Traj-Evolve: A Self-Evolving Multi-Agent System for Patient Trajectory Modeling in Lung Cancer Early Detection

Traj-Evolve introduces a self-evolving multi-agent system that models patient trajectories from longitudinal electronic health records for lung cancer early detection. The system combines an Experience Pool for retrieval-augmented few-shot learning with multi-agent reinforcement learning to optimize collaboration, outperforming nine baselines on both general and never-smoker populations.

AI · Bullish · arXiv – CS AI · Jun 2 · 6/10

Critic-R: Improving Agentic Search using Instruction-tuned Retrievers with Natural Language Introspective Feedback

Researchers introduce Critic-R, a framework that improves agentic search systems by creating a feedback loop between reasoning agents and retrieval models. The approach uses a critic model to evaluate whether retrieved context supports reasoning steps and includes two mechanisms: Critic-R-Zero for query refinement at inference time, and Critic-Embed for training retrievers without manual annotations, demonstrating significant improvements on multi-hop question-answering benchmarks.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Learning World Models for Interactive Video Generation

Researchers propose Video Retrieval Augmented Generation (VRAG) to address fundamental challenges in interactive world models for long-form video generation, specifically tackling compounding errors and spatiotemporal incoherence. The work establishes that autoregressive video generation inherently struggles with error accumulation, while explicit global state conditioning significantly improves long-term consistency and interactive planning capabilities.

AI · Neutral · arXiv – CS AI · Mar 11 · 6/10

Quantifying the Accuracy and Cost Impact of Design Decisions in Budget-Constrained Agentic LLM Search

Researchers developed Budget-Constrained Agentic Search (BCAS) to evaluate how search depth, retrieval strategies, and token budgets affect accuracy and cost in AI search systems. The study found that hybrid retrieval with lightweight re-ranking produces the largest gains, and that accuracy improves with additional searches only up to a small cap.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7

Multimodal Mixture-of-Experts with Retrieval Augmentation for Protein Active Site Identification

Researchers introduce MERA (Multimodal Mixture-of-Experts with Retrieval Augmentation), a new AI framework for protein active site identification that addresses challenges in drug discovery. The system achieves 90% AUPRC performance on active site prediction through hierarchical multi-expert retrieval and reliability-aware fusion strategies.