#fact-checking News & Analysis

22 articles tagged with #fact-checking. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI × Crypto · Bearish · Crypto Briefing · 2d ago · 7/10

Lenz Research study finds AI models disagree on 67% of fact-check claims

A Lenz Research study finds that AI models disagree on 67% of fact-checking claims, pointing to significant inconsistencies in how different AI systems evaluate accuracy. The finding underscores the need for human oversight and diverse information sources, particularly in high-stakes settings such as cryptocurrency markets.

AI · Bearish · Decrypt · 2d ago · 7/10

AI Models Can’t Agree on Basic Facts Most of the Time, Study Shows

A new study found that five frontier AI models disagreed on how to fact-check 67% of 1,000 real-world claims, raising critical concerns about AI reliability and consistency. This inconsistency highlights fundamental limitations in current large language models that could impact their deployment in high-stakes applications requiring factual accuracy.

AI · Bullish · arXiv – CS AI · 3d ago · 7/10

Teaching Language Models to Check Grounded Claim Factuality with Human Test-Taking Strategies

Researchers have developed a method to improve how large language models verify factual claims by framing fact-checking as a true/false reading comprehension task with explicit test-taking strategies. The approach reduces token usage by over 80% while maintaining competitive performance, and enables smaller language models to perform similarly to larger ones through fine-tuning and self-revision mechanisms.

AI · Neutral · arXiv – CS AI · 4d ago · 7/10

The Future of Facts: Tracing the Factual Generation-Verification Gap

Researchers reveal that language models verify factual information more reliably than they generate it, a phenomenon driven by distinct training dynamics rather than computational limitations. The study traces this generation-verification gap across model families and training phases, finding that models can simultaneously accept contradictory facts after updates, creating consistency issues for AI systems deployed as knowledge interfaces.

AI · Bullish · arXiv – CS AI · 4d ago · 7/10

DecomposeRL: Learning to Ask Useful, Informative, and Diverse Questions for Semi-Supervised, Traceable Claim Verification

DecomposeRL presents a novel reinforcement learning approach to claim verification that achieves high accuracy while maintaining interpretability through decomposition-based reasoning. A 7B parameter model trained on just 5K curated claims matches 32B baselines and GPT-4.1-mini across 11 benchmarks while enabling semi-supervised learning, demonstrating efficient scaling through intelligent data curation.

🧠 GPT-4

AI · Neutral · arXiv – CS AI · May 1 · 7/10

The Impact of AI-Generated Text on the Internet

A comprehensive study using Internet Archive data reveals that approximately 35% of newly published websites by mid-2025 contain AI-generated or AI-assisted text, up from zero before ChatGPT's launch in late 2022. While the research finds statistical support for concerns about reduced semantic diversity and increased positive sentiment bias, it contradicts public fears about declining factual accuracy and stylistic diversity, highlighting a significant gap between perceived and measured impacts of AI-generated content.

🧠 ChatGPT

AI · Bullish · arXiv – CS AI · May 1 · 7/10

VeriTaS: The First Dynamic Benchmark for Multimodal Automated Fact-Checking

Researchers have introduced VeriTaS, a dynamic benchmark for evaluating automated fact-checking systems across 25,000 real-world claims in 54 languages and multiple media formats. Unlike static benchmarks vulnerable to data leakage from LLM pretraining, VeriTaS updates quarterly with claims from 104 professional fact-checkers, maintaining relevance as foundation models evolve.

AI · Bearish · arXiv – CS AI · Mar 17 · 7/10

DECEIVE-AFC: Adversarial Claim Attacks against Search-Enabled LLM-based Fact-Checking Systems

Researchers developed DECEIVE-AFC, an adversarial attack framework that can significantly compromise AI-based fact-checking systems by manipulating claims to disrupt evidence retrieval and reasoning. The attacks reduced fact-checking accuracy from 78.7% to 53.7% in testing, highlighting major vulnerabilities in LLM-based verification systems.

AI · Bullish · arXiv – CS AI · 3d ago · 6/10

Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

Researchers introduce Ptah, a multi-agent AI system designed to generate verifiable multimodal research reports by orchestrating planning, evidence collection, and writing stages while maintaining visual-text consistency. The system includes a verification agent to enforce factual grounding and citation accuracy, addressing a key limitation in LLM-generated long-form content that combines text and images.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

Checking Fact with Better Retrieval: Dynamic Contrastive Learning for Evidence Retrieval

Researchers propose DACLR, a dynamic contrastive learning method that improves evidence retrieval for multimodal fact-checking by converting diverse media types to text and extracting event-level features. The approach uses a two-stage recall-rerank system with adaptive loss functions to better match claims with relevant evidence rather than merely semantically similar content.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

CiteCheck: Retrieval-Grounded Detection of LLM Citation Hallucinations in Scientific Text

Researchers introduce CiteCheck, a hybrid framework that detects when large language models fabricate or corrupt scientific citations by combining scholarly database retrieval with structured LLM verification. The system achieves 88.7% macro-F1 on a new 982-citation physics benchmark, outperforming GPT, Claude, and Gemini, addressing a critical reliability problem as LLMs become integrated into scientific research workflows.

🧠 Claude · 🧠 Gemini

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

The Decision to Verify: How Warmth and User Characteristics Shape Reliance on Conversational Agents for Information Search

A research study examines how users interact with conversational AI systems when fact-checking is accessible through hybrid search interfaces. The findings reveal that users continue to over-rely on AI answers despite having web search available, with verification behavior driven primarily by user characteristics like prior trust rather than answer quality, while conversational warmth indirectly increases reliance by boosting agreement with incorrect responses.

AI · Bearish · Ars Technica – AI · May 22 · 6/10

AI put "synthetic quotes" in his book. But this author wants to keep using it.

Author Steven Rosenbaum included inaccurate quotes generated by AI in his book 'The Future of Truth,' raising questions about AI's role in content creation and factual accuracy. Despite acknowledging the error, Rosenbaum indicates he plans to continue using similar AI tools, highlighting the tension between AI efficiency and editorial integrity in publishing.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

MERMAID: Memory-Enhanced Retrieval and Reasoning with Multi-Agent Iterative Knowledge Grounding for Veracity Assessment

Researchers introduce MERMAID, a memory-enhanced multi-agent framework for automated fact-checking that couples evidence retrieval with reasoning processes. The system achieves state-of-the-art performance on multiple benchmarks by reusing retrieved evidence across claims, reducing redundant searches and improving verification efficiency.

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10

A Graph-Enhanced Defense Framework for Explainable Fake News Detection with LLM

Researchers propose G-Defense, a graph-enhanced framework that uses large language models and retrieval-augmented generation to detect fake news while providing explainable, fine-grained reasoning. The system decomposes news claims into sub-claims, retrieves competing evidence, and generates transparent explanations without requiring verified fact-checking databases.

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

FactReview: Evidence-Grounded Reviews with Literature Positioning and Execution-Based Claim Verification

Researchers introduce FactReview, an AI system that improves academic peer review by combining claim extraction, literature positioning, and code execution to verify research claims. The system addresses weaknesses in current LLM-based reviewing by grounding assessments in external evidence rather than relying solely on manuscript narratives.

$MKR

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

Schema-Aware Planning and Hybrid Knowledge Toolset for Reliable Knowledge Graph Triple Verification

Researchers have developed SHARP, a new AI agent that significantly improves knowledge graph verification by combining internal structural data with external evidence. The system achieved 4.2% and 12.9% accuracy improvements over existing methods on major datasets, offering better interpretability for complex fact verification tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs

Researchers propose a new framework for large language models that separates planning from factual retrieval to improve reliability in fact-seeking question answering. The modular approach uses a lightweight student planner trained via teacher-student learning to generate structured reasoning steps, showing improved accuracy and speed on challenging benchmarks.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

MALicious INTent Dataset and Inoculating LLMs for Enhanced Disinformation Detection

Researchers released MALINT, the first human-annotated English dataset for detecting disinformation and its malicious intent, developed with expert fact-checkers. The study benchmarked 12 language models and introduced intent-based inoculation techniques that improved zero-shot disinformation detection across six datasets, five LLMs, and seven languages.

🧠 Llama

AI · Neutral · The Verge – AI · Mar 3 · 6/10 · 4

Here’s how journalists spot deepfakes

Following recent military strikes on Iran, floods of fake images and videos have appeared online, including AI-generated content and footage from video games like War Thunder. Reputable news organizations like The New York Times, Indicator, and Bellingcat use extensive verification procedures to combat the spread of synthetic and misleading content during major news events.

AI · Bearish · arXiv – CS AI · Feb 27 · 6/10 · 5

Misinformation Exposure in the Chinese Web: A Cross-System Evaluation of Search Engines, LLMs, and AI Overviews

Researchers analyzed factual accuracy of Chinese web information systems, comparing traditional search engines, standalone LLMs, and AI overviews using 12,161 real-world queries. The study found substantial differences in factual accuracy across systems and estimated potential misinformation exposure for Chinese users.

AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 6

Multi-Sourced, Multi-Agent Evidence Retrieval for Fact-Checking

Researchers propose WKGFC, a new AI system that uses knowledge graphs and multi-agent retrieval to improve fact-checking accuracy. The system addresses limitations of current methods that rely on textual similarity by implementing an automated Markov Decision Process with LLM agents to retrieve and verify evidence from multiple sources.