#retrieval-systems News & Analysis

28 articles tagged with #retrieval-systems. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 5 · 7/10

RAG Security and Privacy: Formalizing the Threat Model and Attack Surface

Researchers propose the first formal threat model for Retrieval-Augmented Generation (RAG) systems, which combine LLMs with external document retrieval. The framework identifies new security vulnerabilities including document membership inference and data poisoning attacks that emerge from RAG's reliance on external knowledge bases, addressing a critical gap in AI safety research.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

Researchers introduce DiscourseFlip, a novel attack method against Retrieval-Augmented Generation (RAG) systems that manipulates opinions across multiple related queries by poisoning retrieval content at the discourse level. Unlike previous attacks targeting individual queries, this coordinated approach induces broader opinion shifts while evading detection, and existing defenses prove ineffective against it.

AI · Bullish · arXiv – CS AI · May 29 · 7/10

OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources

OmniRetrieval is a new framework that enables unified retrieval across heterogeneous knowledge sources—including unstructured text, relational databases, knowledge graphs, and property graphs—by translating natural language queries into source-native queries rather than forcing all data into a homogenized format. The system demonstrates superior performance compared to single-source retrievers across 13 datasets and 309 knowledge bases, positioning it as a general-purpose interface that preserves the structural advantages of each knowledge source.

AI · Bearish · arXiv – CS AI · May 28 · 7/10

MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks

Researchers present MM-PoisonRAG, a framework demonstrating critical vulnerabilities in multimodal RAG systems where adversaries can inject poisoned content into knowledge bases to manipulate AI outputs. Two attack strategies—localized poisoning targeting specific queries and globalized poisoning affecting all queries—achieve high success rates and bypass existing defenses, exposing fundamental security gaps in RAG-augmented language models.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

FlashHead: Efficient Drop-In Replacement for the Classification Head in Language Model Inference

Researchers introduce FlashHead, a training-free replacement for classification heads in language models that delivers up to 1.75x inference speedup while maintaining accuracy. The innovation addresses a critical bottleneck where classification heads consume up to 60% of model parameters and 50% of inference compute in modern language models.

🧠 Llama

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10

HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems

Researchers introduce HubScan, an open-source security scanner that detects 'hubness poisoning' attacks in Retrieval-Augmented Generation (RAG) systems. The tool achieves 90% recall in detecting adversarial content that exploits vector similarity search vulnerabilities, addressing a critical security flaw in AI systems that rely on external knowledge retrieval.

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

ELVA: Exploring Ranking-Driven Universal Multimodal Retrieval

Researchers introduce ELVA, a reinforcement learning framework that improves multimodal retrieval by addressing 'grain blindness', where models fail to capture fine-grained query details. The approach weights negative samples differently according to their similarity and achieves a 13.1% improvement on MRBench, a new benchmark designed for multi-grain queries.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines

Researchers demonstrate that existing corpus poisoning attacks against RAG systems fail significantly after reranking stages, revealing a critical gap between retrieval-stage attacks and real-world multi-stage pipelines. They propose CRCP, a new poisoning framework that accounts for document chunking and reranking to achieve higher attack success rates across realistic retrieval configurations.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Conan-embedding-v3: Fusing Modality-Specific Models for Omni-Modal Embedding

Researchers introduce Conan-embedding-v3, a framework that enables unified embedding spaces across multiple data modalities (text, image, video, audio, documents) by training specialized models independently and fusing them into a single backbone. The approach identifies and solves a critical technical challenge called 'Projector Drift' that causes audio retrieval performance degradation when external encoders are integrated.

AI · Bullish · arXiv – CS AI · Jun 9 · 6/10

DySink: Dynamic Frame Sinks for Autoregressive Long Video Generation

Researchers introduce DySink, a novel framework for autoregressive long video generation that dynamically selects relevant historical frames instead of using static early-frame anchors. The method addresses the problem of outdated context degrading video quality and introduces a sink anomaly gate to prevent content collapse, demonstrating improvements in temporal consistency for minute-long videos.

AI · Bullish · arXiv – CS AI · Jun 9 · 6/10

How Many Tools Should an LLM Agent See? A Chance-Corrected Answer

Researchers propose Bits-over-Random (BoR), a chance-corrected metric for determining optimal tool shortlist sizes for LLM agents, and develop a reinforcement learning approach that dynamically adjusts how many tools to show per query. Testing across benchmarks with 20 to 3,251 tools demonstrates that adaptive shortlists significantly improve both tool retrieval and LLM selection accuracy while reducing cognitive overload.

🧠 Claude · 🧠 Sonnet

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

ANN Search: Recall What Matters

Researchers propose replacing Recall@k with 1/Ratio@k as the standard metric for evaluating approximate nearest neighbor (ANN) search algorithms. The new metric measures actual distance quality rather than overlap with true neighbors, achieving operational thresholds at substantially lower computational cost while better tracking real-world task performance in classification and retrieval-augmented generation.
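
To make the contrast between the two metrics concrete, here is a minimal Python sketch. It assumes Ratio@k is the mean ratio of each returned neighbor's distance to the distance of the true neighbor at the same rank, which is a common definition in ANN evaluation; the paper's exact formulation may differ. The function names and toy numbers are hypothetical.

```python
def recall_at_k(returned_ids, true_ids):
    """Fraction of the true k nearest neighbors that were returned."""
    return len(set(returned_ids) & set(true_ids)) / len(true_ids)


def inverse_ratio_at_k(returned_dists, true_dists):
    """1 / Ratio@k under the assumed definition: Ratio@k is the mean of
    (returned distance / true distance) at each rank.
    Requires every true distance to be strictly positive."""
    ratios = [r / t for r, t in zip(sorted(returned_dists), sorted(true_dists))]
    return len(ratios) / sum(ratios)


# Toy query: the index returns one exact neighbor and one near miss
# that is only 5% farther away than the true second neighbor.
true_ids, true_dists = [1, 2], [1.0, 2.0]
returned_ids, returned_dists = [1, 7], [1.0, 2.1]

recall = recall_at_k(returned_ids, true_ids)
quality = inverse_ratio_at_k(returned_dists, true_dists)
print(f"Recall@2 = {recall:.2f}, 1/Ratio@2 = {quality:.3f}")
```

In this toy case the overlap-based score is halved by a single near miss, while the distance-based score stays close to 1, which is the kind of gap the summary describes.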

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Rethinking Literature Search Evaluation: Deep Research Helps, and Human Citation Lists Are Not a Ground Truth

Researchers demonstrate that deep literature search pipelines dramatically improve retrieval performance (from ~20% to 80% recall) compared to basic API searches, while simultaneously revealing that human citation lists contain significant bias and are unsuitable as ground truth for evaluation. The study advocates for multi-dimensional evaluation metrics beyond simple recall to assess citation quality accurately.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Entity-Collision: A Stratified Protocol for Attributing Retrieval Lift in Agent Memory

Researchers propose entity-collision, a standardized testing protocol for evaluating retrieval systems in agent memory applications. The protocol isolates embedder performance from lexical overlap by construction, revealing that encoder capacity alone doesn't guarantee better retrieval—MiniLM-384 outperforms larger models on mixed query types despite having fewer parameters than BGE-large.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

RE-TRIANGLE: Does TRIANGLE Enable Multimodal Alignment Beyond Cosine Similarity in Retrieval?

A reproducibility study of the TRIANGLE framework reveals that geometric alignment on hyperspheres improves multimodal retrieval beyond traditional pairwise approaches, achieving up to 8.7 point gains in zero-shot settings. However, researchers identified critical optimization instabilities when jointly training with data-text matching loss and reduced cross-dataset generalization with fine-tuning, suggesting the method's benefits are context-dependent rather than universally applicable.

AI · Neutral · arXiv – CS AI · May 4 · 6/10

A Survey of Reasoning-Intensive Retrieval: Progress and Challenges

A comprehensive survey systematizes Reasoning-Intensive Retrieval (RIR), a rapidly emerging field that integrates Large Language Model reasoning capabilities into information retrieval systems. The study provides the first structured framework organizing RIR benchmarks, methods, and taxonomies to guide future research in this fragmented but high-growth area.

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

Cooperative Memory Paging with Keyword Bookmarks for Long-Horizon LLM Conversations

Researchers propose cooperative paging, a method for managing long LLM conversations by replacing evicted context with compact keyword bookmarks and providing a recall tool for on-demand retrieval. The technique outperforms existing solutions on the LoCoMo benchmark across multiple models, though bookmark discrimination remains a critical limitation.

🧠 GPT-4 · 🧠 Claude

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

Rashomon Memory: Towards Argumentation-Driven Retrieval for Multi-Perspective Agent Memory

Researchers propose Rashomon Memory, a new AI agent memory architecture where multiple goal-conditioned agents maintain parallel interpretations of the same events and negotiate through argumentation at query time. The system allows AI agents to handle conflicting perspectives on experiences rather than forcing a single interpretation, using Dung's argumentation semantics to determine which proposals survive retrieval.

AI · Neutral · arXiv – CS AI · Mar 26 · 6/10

Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA

A research study on retrieval-augmented generation (RAG) systems for AI policy analysis found that improving retrieval quality doesn't necessarily lead to better question-answering performance. The research used 947 AI policy documents and discovered that stronger retrieval can paradoxically cause more confident hallucinations when relevant information is missing.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Compute Allocation for Reasoning-Intensive Retrieval Agents

Researchers studied computational resource allocation in AI retrieval systems for long-horizon agents, finding that re-ranking stages benefit more from powerful models and deeper candidate pools than query expansion stages. The study suggests concentrating compute power on re-ranking rather than distributing it uniformly across pipeline stages for better performance.

🧠 Gemini

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

Structured Distillation for Personalized Agent Memory: 11x Token Reduction with Retrieval Preservation

Researchers developed a structured distillation method that compresses AI agent conversation history by 11x (from 371 to 38 tokens per exchange) while maintaining 96% of retrieval quality. The technique enables thousands of exchanges to fit within a single prompt at 1/11th the context cost, addressing the expensive verbatim storage problem for long AI conversations.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

Democratizing GraphRAG: Linear, CPU-Only Graph Retrieval for Multi-Hop QA

Researchers present SPRIG, a CPU-only GraphRAG system that eliminates expensive LLM-based graph construction and GPU requirements for multi-hop question answering. The system uses lightweight NER-driven co-occurrence graphs with Personalized PageRank, achieving comparable performance while reducing computational costs by 28%.
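
As a rough illustration of the general technique named in this summary (not SPRIG's actual code), the Python sketch below builds an entity co-occurrence graph and ranks entities with Personalized PageRank by power iteration. It assumes NER has already produced one entity list per passage and that every seed entity appears in the graph; the function names, the damping factor of 0.85, and the toy passages are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(passage_entities):
    """Undirected weighted graph: edge weight counts the passages
    in which two entities appear together."""
    graph = defaultdict(lambda: defaultdict(float))
    for entities in passage_entities:
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def personalized_pagerank(graph, seeds, alpha=0.85, iters=50):
    """Power iteration with restarts concentrated on the seed entities.
    Assumes every seed is a node of the graph."""
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            out_weight = sum(graph[n].values())
            for m, w in graph[n].items():
                nxt[m] += alpha * rank[n] * w / out_weight
        rank = nxt
    return rank


# Hypothetical NER output: one entity list per passage.
passages = [["Paris", "France"], ["France", "Europe"], ["Tokyo", "Japan"]]
graph = cooccurrence_graph(passages)
scores = personalized_pagerank(graph, seeds={"Paris"})
for entity, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{entity}: {score:.3f}")
```

Entities reachable from the query seed through shared passages receive score, while the disconnected Tokyo and Japan nodes stay at zero. Everything here is dictionary arithmetic on the CPU, with no LLM calls during graph construction.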

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

Higress-RAG: A Holistic Optimization Framework for Enterprise Retrieval-Augmented Generation via Dual Hybrid Retrieval, Adaptive Routing, and CRAG

Researchers have developed Higress-RAG, a new enterprise-grade framework that addresses key challenges in Retrieval-Augmented Generation systems including low retrieval precision, hallucination, and high latency. The system introduces innovations like 50ms semantic caching, hybrid retrieval methods, and corrective evaluation to optimize the entire RAG pipeline for production use.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10

Comparative Analysis of Neural Retriever-Reranker Pipelines for Retrieval-Augmented Generation over Knowledge Graphs in E-commerce Applications

Researchers developed improved neural retriever-reranker pipelines for Retrieval-Augmented Generation (RAG) systems over knowledge graphs in e-commerce applications. The study achieved 20.4% higher Hit@1 and 14.5% higher Mean Reciprocal Rank compared to existing benchmarks, providing a framework for production-ready RAG systems.
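
For readers unfamiliar with the two metrics quoted here, the following Python sketch computes Hit@k and Mean Reciprocal Rank using their standard definitions, with a single relevant item per query. The function names and the toy rankings are hypothetical and are not taken from the study.

```python
def hit_at_k(ranked_lists, relevant, k=1):
    """Share of queries whose relevant item appears in the top k results."""
    hits = sum(1 for ranked, rel in zip(ranked_lists, relevant) if rel in ranked[:k])
    return hits / len(ranked_lists)


def mean_reciprocal_rank(ranked_lists, relevant):
    """Average of 1/rank of the relevant item; a query contributes 0
    when its relevant item is not retrieved at all."""
    total = 0.0
    for ranked, rel in zip(ranked_lists, relevant):
        if rel in ranked:
            total += 1.0 / (ranked.index(rel) + 1)
    return total / len(ranked_lists)


# Three toy queries: relevant item ranked first, ranked second, and missing.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
gold = ["a", "a", "z"]

print("Hit@1:", hit_at_k(rankings, gold, k=1))
print("MRR:", mean_reciprocal_rank(rankings, gold))
```

Hit@1 only rewards a correct top result, whereas MRR also gives partial credit when the relevant item appears lower in the ranking, which is why the two numbers are usually reported together.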

AI · Bullish · Hugging Face Blog · Oct 1 · 6/10

Introducing RTEB: A New Standard for Retrieval Evaluation

The article introduces RTEB (Retrieval Embedding Benchmark), a new standard for evaluating how accurately embedding models retrieve relevant documents in AI applications. The benchmark combines open and private datasets so that scores reflect generalization to unseen data rather than overfitting to public test sets.
