#hidden-states News & Analysis

5 articles tagged with #hidden-states. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

5 articles

AINeutralarXiv – CS AI · May 127/10

🧠

Hidden Error Awareness in Chain-of-Thought Reasoning: The Signal Is Diagnostic, Not Causal

Researchers discovered that large language models internally detect their own reasoning errors with 95% accuracy but verbally express unwarranted confidence in flawed outputs. Despite this hidden awareness, four intervention strategies failed to correct the errors, indicating the signal reflects computation quality rather than a mechanism that can be leveraged for improvement.

🧠 Llama

AINeutralarXiv – CS AI · May 97/10

🧠

Attractor Geometry of Transformer Memory: From Conflict Arbitration to Confident Hallucination

Researchers have identified a geometric framework explaining how language models fail through two distinct mechanisms: parametric memory conflicting with working memory, and hallucination from absent learned facts. Both failures produce confident outputs despite being mechanistically different, but hidden-state geometry and 'geometric margin' metrics can distinguish them more reliably than traditional entropy-based detection methods.

AIBullisharXiv – CS AI · May 47/10

🧠

To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

Researchers present a decision-making framework to optimize when large language models should call external tools like web search. The study reveals that models often misjudge their actual need for tool use, and proposes lightweight estimators trained on hidden states to improve tool-calling decisions, demonstrating performance gains across multiple tasks.

AIBullisharXiv – CS AI · Apr 207/10

🧠

Learning Uncertainty from Sequential Internal Dispersion in Large Language Models

Researchers introduce Sequential Internal Variance Representation (SIVR), a novel supervised framework for detecting hallucinations in large language models by analyzing token-wise and layer-wise variance patterns in hidden states. The method demonstrates superior generalization compared to existing approaches while requiring smaller training datasets, potentially enabling practical deployment of hallucination detection systems.

AINeutralarXiv – CS AI · May 276/10

🧠

Hidden-State Privacy Has an Empty Middle

Researchers demonstrate that Gaussian mechanisms for hidden-state privacy face a fundamental trade-off, with no configurations achieving both moderate utility and moderate privacy against adaptive attackers. A diagonal inverse-Fisher mechanism emerges as minimax-optimal but sits at the privacy-utility boundary rather than within an achievable middle ground, suggesting future work must redesign architectures rather than optimize within existing Gaussian frameworks.