#causal-analysis News & Analysis

14 articles tagged with #causal-analysis. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 1 · 7/10

Mechanistic Interpretability as Statistical Estimation: A Variance Analysis

Researchers demonstrate that mechanistic interpretability—the process of reverse-engineering AI model behaviors through circuit discovery—suffers from fundamental statistical instability due to high variance in causal mediation analysis. The findings reveal that circuit structures are fragile and highly sensitive to input data and hyperparameter changes, calling into question the scientific validity of existing MI methodologies and necessitating stricter statistical practices in the field.

AI · Neutral · arXiv – CS AI · May 27 · 7/10

Emergent Causal-Geometric Dynamics Across Depth in Large Language Models

Researchers have synthesized geometric and causal analysis approaches to explain how large language models transform context into predictions across layers, identifying a sharp computational transition in decoder-only LLMs and revealing that angular structure in late layers governs token prediction while representation norms operate independently.

AI · Neutral · arXiv – CS AI · May 12 · 7/10

Causal Dimensionality of Transformer Representations: Measurement, Scaling, and Layer Structure

Researchers introduce causal dimensionality (kappa), a measurable property quantifying how transformer layers causally influence model outputs, finding that representational capacity grows 15.6x faster than causal capacity across scaling conditions. The metric remains invariant to model size increases, suggesting causal influence is a fundamental architectural property independent of parameter count.

AI · Neutral · arXiv – CS AI · Apr 20 · 7/10

Hallucination as Trajectory Commitment: Causal Evidence for Asymmetric Attractor Dynamics in Transformer Generation

Researchers demonstrate through causal experiments that hallucinations in language models arise from early trajectory commitments governed by asymmetric attractor dynamics. Using controlled prompt bifurcation on Qwen2.5-1.5B, they show that 44% of test prompts diverge into factual or hallucinated outputs at the first token, with activation patterns revealing that corrupting correct trajectories is far easier than recovering hallucinated ones—suggesting hallucination represents a stable but difficult-to-escape attractor state.

AI · Neutral · arXiv – CS AI · Apr 14 · 7/10

Why Do Large Language Models Generate Harmful Content?

Researchers used causal mediation analysis to identify why large language models generate harmful content, discovering that harmful outputs originate in later model layers primarily through MLP blocks rather than attention mechanisms. Early layers develop contextual understanding of harmfulness that propagates through the network to sparse neurons in final layers that act as gating mechanisms for harmful generation.

AI · Neutral · arXiv – CS AI · Apr 14 · 7/10

Thought Branches: Interpreting LLM Reasoning Requires Resampling

Researchers demonstrate that interpreting large language model reasoning requires analyzing distributions of possible reasoning chains rather than single examples. By resampling text after specific points, they show that stated reasons often don't causally drive model decisions, off-policy interventions are unstable, and hidden contextual hints exert cumulative influence even when explicitly removed.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Inside the Latent Flow: Causal Deciphering of Attention Dynamics in Audio Separation Foundation Models

Researchers have developed a causal analysis framework to understand how attention mechanisms work in SAM Audio, a flow-matching transformer for audio separation. The study reveals a dual-pathway conditioning system and proposes Layer-Selective Attention Caching (LSAC), a training-free optimization technique that reduces computational overhead by ~25% while maintaining audio quality.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

How Many Counterfactuals Does It Take? Probing VLM Hallucinations Through Circuits and Causal Effects

Researchers present a methodology for detecting hallucinations in Vision-Language Models (VLMs) by measuring sample complexity under counterfactual perturbations. Using circuit discovery techniques and causal influence metrics, they establish empirical bounds on the minimum number of counterfactual samples needed to reliably identify unstable hallucinated predictions.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

TRUE: A Trustworthy Unified Explanation Framework for Large Language Model Reasoning

Researchers introduce TRUE (Trustworthy Unified Explanation Framework), a new methodology for interpreting and verifying the reasoning processes of large language models across multiple analytical levels. The framework combines executable verification, structural analysis, and causal failure mode detection to provide transparent insights into LLM decision-making, addressing critical gaps in current interpretability methods.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Answer Presence Drives RAG Rewriting Gains

A new research audit challenges the assumed benefits of LLM rewriters in retrieval-augmented QA systems, finding that performance gains stem primarily from the presence of gold answer strings in rewritten context rather than from genuine passage curation. The study introduces controlled intervention methods to test rewriter claims, revealing that conventional evaluation probes are sensitive to methodology choices and may report misleading results.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

ORCA: An End-to-End Interactive Copilot for Optimized Root Cause Analysis

Researchers have introduced ORCA, an AI copilot system designed to make causal analysis accessible to domain experts across manufacturing, medicine, and social science. The tool automates root cause analysis workflows while allowing users to control the level of automation, from fully automatic to highly guided execution, addressing a significant accessibility gap in complex analytical methods.

🏢 Microsoft

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Is Chain-of-Thought Really Not Explainability? Chain-of-Thought Can Be Faithful without Hint Verbalization

Researchers challenge recent claims that Chain-of-Thought (CoT) reasoning in language models is unfaithful when it omits prompt-injected hints. The study argues the Biasing Features metric conflates incompleteness with unfaithfulness, and demonstrates through multiple evaluation approaches that non-verbalized hints can still causally influence predictions, suggesting token constraints rather than model deception explain missing hint mentions.

AI · Neutral · arXiv – CS AI · May 4 · 6/10

Minimal, Local, Causal Explanations for Jailbreak Success in Large Language Models

Researchers introduce LOCA, a method for identifying why specific jailbreak attacks succeed against safety-trained LLMs by pinpointing minimal, causal changes in intermediate representations. The approach provides local explanations for individual jailbreak instances rather than global theories, achieving refusal induction with an average of six interpretable changes compared to prior methods requiring 20+.

🧠 Llama

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 7

Econometric vs. Causal Structure-Learning for Time-Series Policy Decisions: Evidence from the UK COVID-19 Policies

A study compares econometric methods with causal machine learning algorithms for analyzing time-series data to inform policy decisions, using UK COVID-19 policies as a case study. It evaluates four econometric methods against eleven causal ML algorithms, finding that econometric methods provide clearer temporal structure rules while causal ML algorithms explore broader graph structures to capture more causal relationships.