y0news

#retrieval-systems News & Analysis

13 articles tagged with #retrieval-systems. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

FlashHead: Efficient Drop-In Replacement for the Classification Head in Language Model Inference

Researchers introduce FlashHead, a training-free replacement for classification heads in language models that delivers up to 1.75x inference speedup while maintaining accuracy. The innovation addresses a critical bottleneck where classification heads consume up to 60% of model parameters and 50% of inference compute in modern language models.

🧠 Llama
AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 5

HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems

Researchers introduce HubScan, an open-source security scanner that detects 'hubness poisoning' attacks in Retrieval-Augmented Generation (RAG) systems. The tool achieves 90% recall in detecting adversarial content that exploits vector similarity search vulnerabilities, addressing a critical security flaw in AI systems that rely on external knowledge retrieval.
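
A "hub" in this context is an item that lands in the nearest-neighbor results of an unusually large share of queries. The sketch below is not HubScan's method, which the summary does not describe. It is a minimal illustration of the underlying idea, assuming inner-product search, a sample of query vectors, and a robust z-score cutoff; the function names and the threshold are hypothetical.

```python
from collections import Counter
from statistics import median


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, docs, k):
    """Return indices of the k documents with the highest inner product."""
    scored = [(dot(query, d), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:k]]


def k_occurrence(queries, docs, k):
    """Count how many sample queries retrieve each document in their top-k."""
    counts = Counter()
    for q in queries:
        counts.update(top_k(q, docs, k))
    return [counts.get(i, 0) for i in range(len(docs))]


def flag_hubs(counts, threshold=3.0):
    """Flag documents whose count sits far above the median.

    Uses a robust z-score (median and median absolute deviation).
    The threshold is an illustrative assumption, not a tuned value.
    """
    med = median(counts)
    mad = median(abs(c - med) for c in counts) or 1.0
    return [i for i, c in enumerate(counts) if (c - med) / mad > threshold]


# Toy data: document 3 has a large norm, so it dominates inner-product search.
docs = [[1, 0], [0, 1], [-1, 0], [5, 5]]
queries = [[1, 0.1], [0.1, 1], [1, 1], [0.5, 0.2]]
counts = k_occurrence(queries, docs, k=1)
suspects = flag_hubs(counts)
```

In this toy example every query retrieves the large-norm document, so it is the only one flagged. A real scanner would need a representative query sample and a calibrated threshold.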

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

Rashomon Memory: Towards Argumentation-Driven Retrieval for Multi-Perspective Agent Memory

Researchers propose Rashomon Memory, a new AI agent memory architecture where multiple goal-conditioned agents maintain parallel interpretations of the same events and negotiate through argumentation at query time. The system allows AI agents to handle conflicting perspectives on experiences rather than forcing a single interpretation, using Dung's argumentation semantics to determine which proposals survive retrieval.
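
In Dung's framework, arguments attack one another and a semantics decides which sets of arguments are acceptable. The summary does not say which semantics Rashomon Memory uses, so the sketch below assumes grounded semantics, the most skeptical of Dung's standard options. The argument labels are hypothetical.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension of an abstract argumentation framework.

    arguments: iterable of hashable labels
    attacks:   set of (attacker, target) pairs
    """
    arguments = set(arguments)
    attackers = {a: {x for x, y in attacks if y == a} for a in arguments}
    accepted = set()
    while True:
        # An argument is defended if each of its attackers is itself
        # attacked by some argument that is already accepted.
        defended = {
            a for a in arguments
            if all(any((d, b) in attacks for d in accepted) for b in attackers[a])
        }
        if defended == accepted:
            return accepted
        accepted = defended


# Hypothetical proposals: p1 attacks p2, and p2 attacks p3.
survivors = grounded_extension({"p1", "p2", "p3"}, {("p1", "p2"), ("p2", "p3")})
```

Here p1 is unattacked and therefore accepted, p1 defeats p2, and that in turn reinstates p3. Two proposals that only attack each other would both be left out under this semantics.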

AI · Neutral · arXiv – CS AI · Mar 26 · 6/10

Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA

A research study on retrieval-augmented generation (RAG) systems for AI policy analysis found that improving retrieval quality doesn't necessarily lead to better question-answering performance. The research used 947 AI policy documents and discovered that stronger retrieval can paradoxically cause more confident hallucinations when relevant information is missing.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Compute Allocation for Reasoning-Intensive Retrieval Agents

Researchers studied computational resource allocation in AI retrieval systems for long-horizon agents, finding that re-ranking stages benefit more from powerful models and deeper candidate pools than query expansion stages. The study suggests concentrating compute power on re-ranking rather than distributing it uniformly across pipeline stages for better performance.

🧠 Gemini
AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

Structured Distillation for Personalized Agent Memory: 11x Token Reduction with Retrieval Preservation

Researchers developed a structured distillation method that compresses AI agent conversation history by 11x (from 371 to 38 tokens per exchange) while maintaining 96% of retrieval quality. The technique enables thousands of exchanges to fit within a single prompt at 1/11th the context cost, addressing the expensive verbatim storage problem for long AI conversations.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 12

Democratizing GraphRAG: Linear, CPU-Only Graph Retrieval for Multi-Hop QA

Researchers present SPRIG, a CPU-only GraphRAG system that eliminates expensive LLM-based graph construction and GPU requirements for multi-hop question answering. The system uses lightweight NER-driven co-occurrence graphs with Personalized PageRank, achieving comparable performance while reducing computational costs by 28%.
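
The combination of a co-occurrence graph and Personalized PageRank can be sketched in a few lines. This is not SPRIG's implementation. It assumes the NER step has already produced a list of entity strings per passage, links entities that share a passage, and runs plain power iteration; the function names, the restart probability, and the toy passages are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(passages):
    """passages: list of entity lists, one per passage (assumed NER output).

    Returns a symmetric weighted adjacency dict; weight = shared passages.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for entities in passages:
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def personalized_pagerank(graph, seeds, alpha=0.15, iterations=50):
    """Power iteration that restarts at the seed entities with probability alpha."""
    nodes = list(graph)
    seeds = [s for s in seeds if s in graph]
    if not seeds:
        raise ValueError("no seed entity is present in the graph")
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: alpha * restart[n] for n in nodes}
        for n in nodes:
            total = sum(graph[n].values())
            for m, w in graph[n].items():
                # Spread this node's rank to neighbors in proportion to edge weight.
                nxt[m] += (1 - alpha) * rank[n] * w / total
        rank = nxt
    return rank


# Toy corpus: the query mentions "Paris", so that entity is the seed.
passages = [["Paris", "France"], ["France", "Europe"], ["Berlin", "Germany"]]
graph = build_cooccurrence_graph(passages)
rank = personalized_pagerank(graph, seeds={"Paris"})
```

Entities close to the seed receive most of the score, while the disconnected Berlin and Germany pair receives none. Passages could then be ranked by the scores of the entities they contain, which is one plausible way to support multi-hop questions without a GPU.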

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 17

Higress-RAG: A Holistic Optimization Framework for Enterprise Retrieval-Augmented Generation via Dual Hybrid Retrieval, Adaptive Routing, and CRAG

Researchers have developed Higress-RAG, a new enterprise-grade framework that addresses key challenges in Retrieval-Augmented Generation systems including low retrieval precision, hallucination, and high latency. The system introduces innovations like 50ms semantic caching, hybrid retrieval methods, and corrective evaluation to optimize the entire RAG pipeline for production use.

$LINK
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5

Comparative Analysis of Neural Retriever-Reranker Pipelines for Retrieval-Augmented Generation over Knowledge Graphs in E-commerce Applications

Researchers developed improved neural retriever-reranker pipelines for Retrieval-Augmented Generation (RAG) systems over knowledge graphs in e-commerce applications. The study achieved 20.4% higher Hit@1 and 14.5% higher Mean Reciprocal Rank compared to existing benchmarks, providing a framework for production-ready RAG systems.
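
For readers unfamiliar with the two metrics, Hit@1 is the share of queries whose top-ranked result is relevant, and Mean Reciprocal Rank averages 1/rank of the first relevant result. The sketch below assumes one relevant item per query, which may differ from the paper's evaluation setup; the function names and sample rankings are illustrative.

```python
def hit_at_k(ranked_lists, relevant, k=1):
    """Fraction of queries whose relevant item appears in the top k results."""
    hits = sum(1 for ranked, rel in zip(ranked_lists, relevant) if rel in ranked[:k])
    return hits / len(ranked_lists)


def mean_reciprocal_rank(ranked_lists, relevant):
    """Average of 1/rank of the relevant item; a miss contributes 0."""
    total = 0.0
    for ranked, rel in zip(ranked_lists, relevant):
        if rel in ranked:
            total += 1.0 / (ranked.index(rel) + 1)
    return total / len(ranked_lists)


# Three queries, each with "a" as the single relevant item.
ranked_lists = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
relevant = ["a", "a", "a"]
hit1 = hit_at_k(ranked_lists, relevant, k=1)          # 1 of 3 queries
mrr = mean_reciprocal_rank(ranked_lists, relevant)    # (1 + 1/2 + 1/3) / 3
```

Hit@1 only rewards a correct top result, while MRR also gives partial credit when the relevant item appears lower in the list.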

AI · Bullish · Hugging Face Blog · Oct 1 · 6/10 · 7

Introducing RTEB: A New Standard for Retrieval Evaluation

The article introduces RTEB (Retrieval Embedding Benchmark), a new standard for evaluating the retrieval accuracy of embedding models in AI applications. The benchmark combines open and private datasets so that scores better reflect how models generalize to data they have not seen.

AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 7

HotelQuEST: Balancing Quality and Efficiency in Agentic Search

Researchers introduce HotelQuEST, a new benchmark for evaluating agentic search systems that balances quality and efficiency metrics. The study reveals that while LLM-based agents achieve higher accuracy than traditional retrievers, they incur substantially higher costs due to redundant operations and poor optimization.

AI · Bullish · Google Research Blog · Jun 25 · 4/10 · 6

MUVERA: Making multi-vector retrieval as fast as single-vector search

MUVERA is a retrieval algorithm that reduces multi-vector similarity search to single-vector maximum inner product search by encoding each set of vectors as one fixed dimensional encoding (FDE). This lets multi-vector models reuse fast single-vector search infrastructure, potentially improving efficiency for AI applications that rely on complex vector-based searches.