#knowledge-retrieval News & Analysis

13 articles tagged with #knowledge-retrieval. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

KACE: Knowledge-Adaptive Context Engineering for Mathematical Reasoning

Researchers introduce KACE, a novel context engineering method that improves large language models' mathematical reasoning by separating knowledge storage from usage through difficulty and domain-based organization. The approach achieves 62.2% accuracy on AIME 2025, significantly outperforming existing self-consistency methods while maintaining comparable computational efficiency.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

MemCog: From Memory-as-Tool to Memory-as-Cognition in Conversational Agents

Researchers introduce MemCog, a new memory system for conversational AI agents that integrates memory access into the reasoning process rather than treating it as a separate tool. The system uses associative link graphs and proactive reasoning to enable agents to autonomously explore relevant information, achieving state-of-the-art performance on multiple benchmarks including a newly created ProactiveMemBench.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

RAGe: A Retrieval-Augmented Generation Evaluation Framework

Researchers introduce RAGe, a benchmarking framework designed to optimize Retrieval-Augmented Generation (RAG) applications by evaluating trade-offs between accuracy, efficiency, and scalability. The framework enables developers to identify optimal pipeline components for domain-specific datasets while accounting for hardware constraints, making RAG development more accessible on consumer-grade hardware.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

VLADriver-RAG: Retrieval-Augmented Vision-Language-Action Models for Autonomous Driving

Researchers introduce VLADriver-RAG, a new framework that combines Vision-Language-Action models with retrieval-augmented generation for autonomous driving. By grounding decisions in explicit historical knowledge rather than relying solely on learned parameters, the system achieves state-of-the-art performance on the Bench2Drive benchmark with a Driving Score of 89.12, demonstrating improved generalization in complex driving scenarios.

AI · Bearish · arXiv – CS AI · Mar 5 · 6/10

τ-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge

Researchers introduce τ-Knowledge, a new benchmark for evaluating conversational AI agents in knowledge-intensive environments, specifically testing their ability to retrieve and apply unstructured domain knowledge. Even frontier models achieve a success rate of only 25.5% when navigating complex fintech customer-support scenarios involving 700 interconnected knowledge documents.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Reinforcement Learning Improves Traversal of Parametric Knowledge in LLMs

Researchers demonstrate that reinforcement learning improves large language models' ability to retrieve existing knowledge by teaching them better procedural skills for navigating internal knowledge hierarchies, rather than adding new information. The findings suggest future AI development should focus on optimizing how models traverse learned knowledge alongside expanding their training data.

AI · Neutral · Google Research Blog · Jun 24 · 6/10

Thinking to recall: How reasoning unlocks parametric knowledge in LLMs

Researchers demonstrate that reasoning processes enable large language models to effectively recall and utilize parametric knowledge stored in their weights, challenging previous assumptions about knowledge retrieval mechanisms. This finding has significant implications for understanding how LLMs access information and suggests that explicit reasoning may be essential for optimal knowledge extraction.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Understanding Benchmark Language Under Weakened Formal Semantics

Researchers propose a method to improve NLP benchmark understanding by extracting executable representations (computables) that provide operational evidence of semantic adequacy beyond traditional text-based reasoning. The approach demonstrates consistent improvements over baseline methods across mathematical reasoning, legal, and biomedical benchmarks while offering inspectable semantic evidence.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

Declarative Skills for AI Agents in Knowledge-Grounded Tool-Use Workflows

Researchers compare three orchestration approaches for AI agents handling customer-service workflows: declarative agents using natural-language skill files, imperative agents with programmatic state machines, and unscaffolded baseline agents. The study finds that retrieval quality is the dominant bottleneck, and declarative skills improve performance on procedural tasks only when evidence quality is high.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Researchers present SLOT, a comprehensive taxonomy for understanding security vulnerabilities in retrieval-augmented generation (RAG) systems that extend LLMs with external knowledge. The framework categorizes attacks and defenses across four dimensions—attack surface, defense layer, security objective, and target scope—while identifying structural gaps in current evaluation methods and proposing future research directions for securing RAG pipelines.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Do We Still Need GraphRAG? Benchmarking RAG and GraphRAG for Agentic Search Systems

A new benchmark study (RAGSearch) evaluates whether agentic search systems can reduce the need for expensive GraphRAG pipelines by dynamically retrieving information across multiple rounds. Results show agentic search significantly improves standard RAG performance and narrows the gap to GraphRAG, though GraphRAG retains advantages for complex multi-hop reasoning tasks when preprocessing costs are considered.

🏢 Meta

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

MERMAID: Memory-Enhanced Retrieval and Reasoning with Multi-Agent Iterative Knowledge Grounding for Veracity Assessment

Researchers introduce MERMAID, a memory-enhanced multi-agent framework for automated fact-checking that couples evidence retrieval with reasoning processes. The system achieves state-of-the-art performance on multiple benchmarks by reusing retrieved evidence across claims, reducing redundant searches and improving verification efficiency.

AI · Bullish · arXiv – CS AI · Mar 11 · 6/10

PRECEPT: Planning Resilience via Experience, Context Engineering & Probing Trajectories - A Unified Framework for Test-Time Adaptation with Compositional Rule Learning and Pareto-Guided Prompt Evolution

Researchers introduce PRECEPT, a new framework for AI language model agents that improves knowledge retrieval and adaptation through structured rule learning and conflict-aware memory systems. The framework shows significant performance improvements over existing methods, with 41% better first-try accuracy and enhanced compositional reasoning capabilities.