y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#activation-patching News & Analysis

1 article tagged with #activation-patching. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 articles
AINeutralarXiv – CS AI Β· 7h ago7/10
🧠

Hallucination as Trajectory Commitment: Causal Evidence for Asymmetric Attractor Dynamics in Transformer Generation

Researchers demonstrate through causal experiments that hallucinations in language models arise from early trajectory commitments governed by asymmetric attractor dynamics. Using controlled prompt bifurcation on Qwen2.5-1.5B, they show that 44% of test prompts diverge into factual or hallucinated outputs at the first token, with activation patterns revealing that corrupting correct trajectories is far easier than recovering hallucinated onesβ€”suggesting hallucination represents a stable but difficult-to-escape attractor state.