16 articles tagged with #hallucination-detection. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Neutral · arXiv – CS AI · 4d ago · 7/10
🧠 Researchers introduce VLM-DeflectionBench, a new benchmark with 2,775 samples designed to evaluate how large vision-language models handle conflicting or insufficient evidence. The study reveals that most state-of-the-art LVLMs fail to appropriately deflect when faced with noisy or misleading information, highlighting critical gaps in model reliability for knowledge-intensive tasks.
AI · Bullish · arXiv – CS AI · Apr 10 · 7/10
🧠 Researchers developed a weak supervision framework to detect hallucinations in large language models by distilling grounding signals into transformer representations during training. Using substring matching, sentence embeddings, and LLM judges, they created a 15,000-sample dataset and trained five probing classifiers that achieve hallucination detection from internal activations alone at inference time, eliminating the need for external verification systems.
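The core idea of probing internal activations can be illustrated with a minimal sketch. Everything below is a stand-in: the activations are synthetic random vectors assumed to separate along a direction, and the probe is a simple centroid-difference linear classifier rather than the paper's trained classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 1000  # hidden-state dimension and sample count (illustrative)

# Stand-in activations: grounded vs. hallucinated generations are assumed
# to separate along some direction in activation space.
direction = rng.normal(size=d)
labels = rng.integers(0, 2, size=n)  # 1 = hallucination
acts = rng.normal(size=(n, d)) + np.outer(2.0 * labels - 1.0, direction)

# Linear probe: project activations onto the difference of class centroids
# and threshold at the midpoint between them.
mu1, mu0 = acts[labels == 1].mean(axis=0), acts[labels == 0].mean(axis=0)
w = mu1 - mu0
threshold = 0.5 * (mu1 + mu0) @ w
preds = (acts @ w > threshold).astype(int)
accuracy = (preds == labels).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

The appeal of this family of methods is the inference-time cost: a single dot product per generation, with no external retrieval or verifier model in the loop.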
AI · Neutral · arXiv – CS AI · Apr 10 · 7/10
🧠 Researchers demonstrate that standard LLM-as-a-judge methods achieve only 52% accuracy in detecting hallucinations and omissions in mental health chatbots, failing in high-risk healthcare contexts. A hybrid framework combining human domain expertise with machine learning features achieves significantly higher performance (0.717-0.849 F1 scores), suggesting that transparent, interpretable approaches outperform black-box LLM evaluation in safety-critical applications.
AI · Bearish · arXiv – CS AI · Apr 7 · 7/10
🧠 Researchers present a new framework for AI safety that identifies a 57-token predictive window for detecting potential failures in large language models. The study found that only one out of seven tested models showed predictive signals before committing to problematic outputs, while factual hallucinations produced no detectable warning signs.
AI · Bullish · arXiv – CS AI · Apr 7 · 7/10
🧠 Researchers developed an LLM-powered evolutionary search method to automatically design uncertainty quantification systems for large language models, achieving up to 6.7% improvement in performance over manual designs. The study found that different AI models employ distinct evolutionary strategies, with some favoring complex linear estimators while others prefer simpler positional weighting approaches.
AI · Bullish · arXiv – CS AI · Mar 26 · 7/10
🧠 Researchers developed SCoOP, a training-free framework that combines multiple Vision-Language Models to improve uncertainty quantification and reduce hallucinations in AI systems. The method achieves 10-13% better hallucination detection performance compared to existing approaches while adding only microsecond-level overhead to processing time.
AI · Neutral · arXiv – CS AI · Mar 12 · 7/10
🧠 Researchers introduce TRACED, a framework that evaluates AI reasoning quality through geometric analysis rather than traditional scalar probabilities. The system identifies correct reasoning as high-progress stable trajectories, while AI hallucinations show low-progress unstable patterns with high curvature fluctuations.
AI × Crypto · Neutral · arXiv – CS AI · Mar 12 · 7/10
🤖 Researchers propose NabaOS, a lightweight verification framework that detects AI agent hallucinations using HMAC-signed tool receipts instead of zero-knowledge proofs. The system achieves 94.2% detection accuracy with <15ms verification time, compared to cryptographic approaches that require 180+ seconds per query.
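The receipt idea is cheap to sketch with the standard library. This is not NabaOS itself, just the general HMAC-receipt pattern: the tool runtime signs what it actually returned, so a verifier can catch an agent that later reports a fabricated result. The key name and receipt fields are assumptions.

```python
import hashlib
import hmac
import json

SECRET = b"shared-tool-secret"  # hypothetical key shared by tool runtime and verifier

def sign_receipt(tool: str, args: dict, result: str) -> dict:
    """Tool runtime attaches an HMAC tag over the canonicalized call record."""
    payload = json.dumps({"tool": tool, "args": args, "result": result},
                         sort_keys=True).encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return {"tool": tool, "args": args, "result": result, "sig": tag}

def verify_receipt(receipt: dict) -> bool:
    """Verifier recomputes the tag; a hallucinated result no longer matches."""
    payload = json.dumps({k: receipt[k] for k in ("tool", "args", "result")},
                         sort_keys=True).encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["sig"])

receipt = sign_receipt("price_lookup", {"asset": "BTC"}, "67000 USD")
print(verify_receipt(receipt))    # genuine receipt verifies: True
receipt["result"] = "1000000 USD" # agent fabricates a different result
print(verify_receipt(receipt))    # tampered receipt fails: False
```

Verification here is one SHA-256 HMAC per tool call, which is why this style of check runs in milliseconds where zero-knowledge approaches take minutes.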
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠 Researchers propose LEAP, a new framework for detecting AI hallucinations using efficient small models that can dynamically adapt verification strategies. The system uses a teacher-student approach where a powerful model trains smaller ones to detect false outputs, addressing a critical barrier to safe AI deployment in production environments.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers introduce HalluGuard, a new framework that identifies and addresses both data-driven and reasoning-driven hallucinations in Large Language Models. The system achieved state-of-the-art performance across 10 benchmarks and 9 LLM backbones, offering a unified approach to improve AI reliability in critical domains like healthcare and law.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10
🧠 Researchers have developed a unified framework using Spectral Geometry and Random Matrix Theory to address reliability and efficiency challenges in large language models. The study introduces EigenTrack for real-time hallucination detection and RMT-KD for model compression while maintaining accuracy.
AI · Neutral · arXiv – CS AI · Apr 10 · 6/10
🧠 Researchers propose a composite architecture combining instruction-based refusal with a structural abstention gate to reduce hallucinations in large language models. The system uses a support deficit score derived from self-consistency, paraphrase stability, and citation coverage to block unreliable outputs, achieving better accuracy than either mechanism alone across multiple models.
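A support-deficit gate of this shape can be sketched in a few lines. The weighting, signal names, and threshold below are all illustrative assumptions, not the paper's calibration; the point is only the structure: combine several support signals into one shortfall score and abstain when it is too high.

```python
def support_deficit(self_consistency: float,
                    paraphrase_stability: float,
                    citation_coverage: float,
                    weights=(0.4, 0.3, 0.3)) -> float:
    """Each signal lies in [0, 1], where 1 means fully supported; the
    deficit is the weighted shortfall (weights are illustrative)."""
    signals = (self_consistency, paraphrase_stability, citation_coverage)
    return sum(w * (1.0 - s) for w, s in zip(weights, signals))

def abstention_gate(answer: str, deficit: float, tau: float = 0.35) -> str:
    """Structural gate: block the answer when the deficit exceeds tau."""
    return answer if deficit < tau else "[abstained: insufficient support]"

deficit = support_deficit(0.9, 0.8, 0.7)  # mostly consistent, decent citations
print(abstention_gate("The capital of France is Paris.", deficit))
```

The gate is orthogonal to instruction-based refusal: the model can still be prompted to refuse, and the structural score catches cases where it confidently answers anyway.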
AI · Bullish · arXiv – CS AI · Mar 26 · 6/10
🧠 Researchers developed HalluJudge, a reference-free system to detect hallucinations in AI-generated code review comments, addressing a key challenge in LLM adoption for software development. The system achieves 85% F1 score with 67% alignment to developer preferences at just $0.009 average cost, making it a practical safeguard for AI-assisted code reviews.
AI · Neutral · arXiv – CS AI · Mar 12 · 6/10
🧠 Researchers have developed the System Hallucination Scale (SHS), a human-centered tool for evaluating hallucination behavior in large language models. The instrument showed strong statistical validity in testing with 210 participants and provides a practical method for assessing AI model reliability from a user perspective.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10
🧠 Researchers developed a training-free method to detect AI hallucinations by reinterpreting LLM output as Energy-Based Models and tracking 'energy spills' during text generation. The approach successfully identifies factual errors and biases across multiple state-of-the-art models including LLaMA, Mistral, and Gemma without requiring additional training or probe classifiers.
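The energy-based reading of logits can be sketched directly: treat each step's free energy as E = -log Σ exp(logits) and flag steps whose energy deviates sharply from the sequence mean. This deviation heuristic is a rough stand-in for the paper's 'energy spill' signal, not its actual definition.

```python
import math

def energy(logits):
    """Free energy of one next-token step: E = -log(sum(exp(logits)))."""
    m = max(logits)  # subtract the max for numerical stability
    return -(m + math.log(sum(math.exp(x - m) for x in logits)))

def energy_spills(per_step_logits, z=2.0):
    """Flag steps whose energy deviates from the sequence mean by more
    than z standard deviations (an illustrative spill criterion)."""
    energies = [energy(step) for step in per_step_logits]
    mean = sum(energies) / len(energies)
    var = sum((e - mean) ** 2 for e in energies) / len(energies)
    sd = math.sqrt(var) or 1.0
    return [i for i, e in enumerate(energies) if abs(e - mean) > z * sd]

# Nine ordinary steps plus one step with an anomalously dominant logit.
steps = [[0.0, 0.0, 0.0]] * 9 + [[10.0, 0.0, 0.0]]
print(energy_spills(steps))  # -> [9]
```

Because the score is computed from quantities the model already emits, the check is training-free and needs no probe classifier, matching the summary's claim.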
AI · Neutral · Apple Machine Learning · Mar 3 · 5/10
🧠 Researchers are developing new methods to detect hallucinations in large language models by identifying specific spans of unsupported content rather than making binary decisions. The study evaluates Chain-of-Thought reasoning approaches to improve the complex multi-step process of hallucination span detection in LLMs.