20 articles tagged with #hallucination. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AIBullisharXiv – CS AI · Apr 77/10
🧠Researchers introduce V-Reflection, a new framework that transforms Multimodal Large Language Models (MLLMs) from passive observers to active interrogators through a 'think-then-look' mechanism. The approach addresses perception-related hallucinations in fine-grained tasks by allowing models to dynamically re-examine visual details during reasoning, showing significant improvements across six perception-intensive benchmarks.
AIBullisharXiv – CS AI · Apr 77/10
🧠Researchers propose PassiveQA, a new AI framework that teaches language models to recognize when they don't have enough information to answer questions, choosing to ask for clarification or abstain rather than hallucinate responses. The three-action system (Answer, Ask, Abstain) uses supervised fine-tuning to align model behavior with information sufficiency, showing significant improvements in reducing hallucinations.
AIBullisharXiv – CS AI · Apr 77/10
🧠Researchers introduce a geometric framework for understanding LLM hallucinations, showing they arise from basin structures in latent space that vary by task complexity. The study demonstrates that factual tasks have clearer separation while summarization tasks show unstable, overlapping patterns, and proposes geometry-aware steering to reduce hallucinations without retraining.
AIBullisharXiv – CS AI · Apr 67/10
🧠Researchers propose Council Mode, a multi-agent consensus framework that reduces AI hallucinations by 35.9% by routing queries to multiple diverse LLMs and synthesizing their outputs through a dedicated consensus model. The system operates through intelligent triage classification, parallel expert generation, and structured consensus synthesis to address factual accuracy issues in large language models.
AINeutralarXiv – CS AI · Apr 67/10
🧠Researchers propose the Hallucination-as-Cue Framework to analyze reinforcement learning's effectiveness in training multimodal AI models. The study reveals that RL training can improve reasoning performance even under hallucination-inductive conditions, challenging assumptions about how these models learn from visual information.
AIBullisharXiv – CS AI · Mar 117/10
🧠Researchers introduce MMGraphRAG, a new AI framework that addresses hallucination issues in large language models by integrating visual scene graphs with text knowledge graphs through cross-modal fusion. The system uses SpecLink for entity linking and demonstrates superior performance in multimodal information processing across multiple benchmarks.
AIBullisharXiv – CS AI · Mar 56/10
🧠Researchers developed MA-RAG, a Multi-Round Agentic RAG framework that improves medical AI reasoning by iteratively refining responses through conflict detection and external evidence retrieval. The system achieved a substantial +6.8 point accuracy improvement over baseline models across 7 medical Q&A benchmarks by addressing hallucinations and outdated knowledge in healthcare AI applications.
AIBullisharXiv – CS AI · Mar 47/103
🧠Researchers conducted the first empirical investigation of hallucination in large language models, revealing that strategic repetition of just 5% of training examples can reduce AI hallucinations by up to 40%. The study introduces 'selective upweighting' as a technique that maintains model accuracy while significantly reducing false information generation.
AIBullisharXiv – CS AI · Feb 277/107
🧠Researchers introduce Spatial Credit Redistribution (SCR), a training-free method that reduces hallucination in vision-language models by 4.7-6.0 percentage points. The technique redistributes attention from dominant visual patches to contextual areas, addressing the spatial credit collapse problem that causes AI models to generate false objects.
AIBullishOpenAI News · Sep 57/107
🧠OpenAI has published new research explaining the underlying causes of language model hallucinations. The study demonstrates how better evaluation methods can improve AI systems' reliability, honesty, and safety performance.
AINeutralarXiv – CS AI · Mar 266/10
🧠Researchers identify 'multi-view hallucination' as a major problem in large vision-language models (LVLMs), where these AI systems confuse visual information from different viewpoints or instances. They created MVH-Bench benchmark and developed Reference Shift Contrastive Decoding (RSCD) technique, which improved performance by up to 34.6 points without requiring model retraining.
AINeutralarXiv – CS AI · Mar 266/10
🧠A research study on retrieval-augmented generation (RAG) systems for AI policy analysis found that improving retrieval quality doesn't necessarily lead to better question-answering performance. The research used 947 AI policy documents and discovered that stronger retrieval can paradoxically cause more confident hallucinations when relevant information is missing.
AIBullisharXiv – CS AI · Mar 176/10
🧠Researchers propose EMBRAG, a new framework that combines large language models with knowledge graphs to improve reasoning accuracy and reduce hallucinations. The system generates multiple logical rules from queries and applies them in embedding space, achieving state-of-the-art performance on knowledge graph question-answering benchmarks.
AIBullisharXiv – CS AI · Mar 66/10
🧠Researchers propose CTRL-RAG, a new reinforcement learning framework that improves large language models' ability to generate accurate, context-faithful responses in Retrieval-Augmented Generation systems. The method uses a Contrastive Likelihood Reward mechanism that optimizes the difference between responses with and without supporting evidence, addressing issues of hallucination and model collapse in existing RAG systems.
AIBullisharXiv – CS AI · Mar 36/104
🧠Researchers propose ChainMPQ, a training-free method to reduce relation hallucinations in Large Vision-Language Models (LVLMs) by using interleaved text-image reasoning chains. The approach addresses the most common but least studied type of AI hallucination by sequentially analyzing subjects, objects, and their relationships through multi-perspective questioning.
AIBullisharXiv – CS AI · Mar 26/1021
🧠Researchers propose a training-free solution to reduce hallucinations in multimodal AI models by rebalancing attention between perception and reasoning layers. The method achieves 4.2% improvement in reasoning accuracy with minimal computational overhead.
AIBearisharXiv – CS AI · Mar 27/1019
🧠Researchers propose a new risk-sensitive framework for evaluating AI hallucinations in medical advice that considers potential harm rather than just factual accuracy. The study reveals that AI models with similar performance show vastly different risk profiles when generating medical recommendations, highlighting critical safety gaps in current evaluation methods.
AINeutralarXiv – CS AI · Feb 276/107
🧠Researchers developed a method to identify whether large language model outputs come from user prompts or internal training data, addressing the problem of AI hallucinations. Their linear classifier probe achieved up to 96% accuracy in determining knowledge sources, with attribution mismatches increasing error rates by up to 70%.
$LINK
AINeutralLil'Log (Lilian Weng) · Jul 75/10
🧠This article defines and categorizes hallucination in large language models, specifically focusing on extrinsic hallucination where model outputs are not grounded in world knowledge. The author distinguishes between in-context hallucination (inconsistent with provided context) and extrinsic hallucination (not verifiable by external knowledge), emphasizing that LLMs must be factual and acknowledge uncertainty to avoid fabricating information.
AINeutralHugging Face Blog · Jan 124/106
🧠This article provides a comprehensive guide for creating custom leaderboards on Hugging Face, using Vectara's hallucination leaderboard as a practical example. It covers the technical setup process and demonstrates how organizations can build their own evaluation frameworks for AI models.