AI · Neutral · arXiv – CS AI · 9h ago · 6/10
Inference Time Causal Probing in LLMs
Researchers introduce Hidden-state Driven Margin Intervention (HDMI), a probe-free technique for causal probing in large language models that manipulates hidden states directly, without training auxiliary classifiers. Across multiple benchmarks, the method is reported to be more reliable than existing approaches because it balances completeness and selectivity.
🧠 Llama