#linear-probes News & Analysis

3 articles tagged with #linear-probes. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 4 · 7/10 · 2

No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

Researchers developed linear probes that can predict whether large language models will answer questions correctly by analyzing neural activations before any answer is generated. The method works across different model sizes and generalizes to out-of-distribution datasets, though it struggles with mathematical reasoning tasks.
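To make the idea of a linear probe concrete, here is a minimal sketch in plain Python. It is not the paper's method or data: the "activations" are synthetic vectors produced by a hypothetical helper (`make_synthetic_activations`), and the probe is an ordinary logistic regression trained with stochastic gradient descent to predict a binary "model answered correctly" label.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_activations(n, dim, rng):
    """Hypothetical stand-in for question-only activations.

    Label 1 means "the model went on to answer correctly". The two classes
    are Gaussian clouds shifted in opposite directions along a fixed axis.
    """
    direction = [1.0 if i % 2 == 0 else -1.0 for i in range(dim)]
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 1.0 if label == 1 else -1.0
        x = [rng.gauss(0.0, 1.0) + shift * d for d in direction]
        data.append((x, label))
    return data


def train_probe(data, dim, lr=0.1, epochs=50):
    """Fit a logistic-regression probe (weights w, bias b) with plain SGD."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss with respect to the logit
            for i in range(dim):
                w[i] -= lr * g * x[i]
            b -= lr * g
    return w, b


def probe_accuracy(w, b, data):
    """Fraction of examples where the probe's sign matches the label."""
    correct = 0
    for x, y in data:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        correct += int(pred == y)
    return correct / len(data)


rng = random.Random(0)
DIM = 8
train = make_synthetic_activations(400, DIM, rng)
test = make_synthetic_activations(200, DIM, rng)
w, b = train_probe(train, DIM)
acc = probe_accuracy(w, b, test)
print(f"held-out probe accuracy: {acc:.2f}")
```

In a real setting the input vectors would be hidden states read from the model after it processes the question, and generalization would be checked on a separate dataset rather than on a second synthetic sample.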

AI · Neutral · arXiv – CS AI · Mar 9 · 6/10

Probing Visual Concepts in Lightweight Vision-Language Models for Automated Driving

Researchers analyzed Vision-Language Models (VLMs) used in automated driving to understand why they fail on simple visual tasks. They identified two failure modes: perceptual failure, where visual information is not encoded, and cognitive failure, where the information is present but not properly aligned with language semantics.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 8

Decoding Answers Before Chain-of-Thought: Evidence from Pre-CoT Probes and Activation Steering

New research reveals that large language models often determine their final answers before generating chain-of-thought (CoT) reasoning, challenging the assumption that CoT reflects the model's actual decision process. Linear probes can predict the model's answer with an AUC of 0.9 before CoT generation, and steering these activations can flip answers in over 50% of cases.
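The steering idea can be illustrated with a toy calculation. The sketch below assumes a probe with weights `w` and bias `b` and a single activation vector `h`; all values are made up for illustration and the helper names are hypothetical. It shifts `h` along the unit probe direction just far enough to move the probe's logit to the opposite side of the decision boundary. This only shows that the probe's readout flips; whether the model's actual answer flips has to be measured by running the model with the modified activations, which this example does not do.

```python
import math


def probe_logit(w, b, h):
    """Linear probe readout: positive means class 1, negative means class 0."""
    return sum(wi * hi for wi, hi in zip(w, h)) + b


def steer(h, w, alpha):
    """Shift activation h by alpha along the unit-normalized probe direction."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [hi + alpha * wi / norm for hi, wi in zip(h, w)]


def flip_alpha(w, b, h, margin=1.0):
    """Shift size that moves the logit to the opposite side, at the given margin.

    Steering by alpha along w / ||w|| changes the logit by alpha * ||w||,
    so we solve z + alpha * ||w|| = target for alpha.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    z = probe_logit(w, b, h)
    target = -margin if z > 0 else margin
    return (target - z) / norm


# Illustrative values only (assumptions, not taken from any paper).
w = [0.5, -1.0, 2.0]
b = 0.1
h = [1.0, 0.5, 0.25]

z_before = probe_logit(w, b, h)
alpha = flip_alpha(w, b, h)
h_steered = steer(h, w, alpha)
z_after = probe_logit(w, b, h_steered)

print(f"logit before: {z_before:+.3f}, after steering: {z_after:+.3f}")
```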