y0news
#ai-interpretability · 4 articles
AI · Neutral · arXiv – CS AI · 4h ago · 2
🧠

The Lattice Representation Hypothesis of Large Language Models

Researchers propose the Lattice Representation Hypothesis, a new framework showing how large language models encode symbolic reasoning through geometric structures. The theory unifies continuous neural representations with formal logic by demonstrating that LLM embeddings naturally form concept lattices that enable symbolic operations through geometric intersections and unions.
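
As a rough, hypothetical illustration of what "symbolic operations through geometric intersections and unions" could look like in practice (not the paper's actual construction), concepts can be modeled as axis-aligned regions spanned by example embeddings, with lattice meet and join computed geometrically:

```python
# Hypothetical sketch: concepts as axis-aligned boxes over example embeddings,
# with lattice meet = geometric intersection and join = bounding region.
# Illustration of the general idea only, not the paper's construction.
import numpy as np

class ConceptBox:
    """A concept as the axis-aligned region spanned by its example embeddings."""
    def __init__(self, embeddings: np.ndarray):
        self.lo = embeddings.min(axis=0)
        self.hi = embeddings.max(axis=0)

    def meet(self, other: "ConceptBox"):
        """Lattice meet: intersection of the two regions (None = bottom element)."""
        lo, hi = np.maximum(self.lo, other.lo), np.minimum(self.hi, other.hi)
        if np.any(lo > hi):
            return None
        box = ConceptBox.__new__(ConceptBox)
        box.lo, box.hi = lo, hi
        return box

    def join(self, other: "ConceptBox") -> "ConceptBox":
        """Lattice join: smallest region containing both concepts."""
        box = ConceptBox.__new__(ConceptBox)
        box.lo = np.minimum(self.lo, other.lo)
        box.hi = np.maximum(self.hi, other.hi)
        return box

    def contains(self, x: np.ndarray) -> bool:
        return bool(np.all(x >= self.lo) and np.all(x <= self.hi))

# Toy usage: "dog AND pet" via meet, "dog OR cat" via join, on random embeddings.
rng = np.random.default_rng(0)
dog, cat = ConceptBox(rng.normal(0, 1, (20, 8))), ConceptBox(rng.normal(2, 1, (20, 8)))
print(dog.join(cat).contains(rng.normal(1, 0.1, 8)))
```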

AI · Bullish · arXiv – CS AI · 4h ago · 1
🧠

CIRCUS: Circuit Consensus under Uncertainty via Stability Ensembles

Researchers introduce CIRCUS, a new method for discovering mechanistic circuits in AI models that addresses uncertainty and brittleness issues in current approaches. The technique builds ensembles of attribution graphs and extracts consensus circuits that are 40x smaller while maintaining explanatory power; it is validated on Gemma-2-2B and Llama-3.2-1B.
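
A minimal, hedged sketch of the stability-ensemble idea, with the model-specific attribution step abstracted into a placeholder; the function names, thresholds, and voting scheme here are illustrative rather than taken from the paper:

```python
# Hedged sketch of consensus-circuit extraction: run an (abstracted)
# edge-attribution procedure over resampled inputs and keep only edges
# that are consistently important across the ensemble.
import numpy as np

def attribute_edges(model, batch, seed: int) -> dict:
    """Placeholder for any edge-attribution method (e.g., patching scores).
    Returns {edge: importance} for one perturbed run; toy random stand-in."""
    rng = np.random.default_rng(seed)
    edges = [(layer, i, j) for layer in range(2) for i in range(4) for j in range(4)]
    return {e: float(rng.random()) for e in edges}

def consensus_circuit(model, batches, tau: float = 0.5, keep_frac: float = 0.8):
    """Keep edges whose importance exceeds tau in >= keep_frac of ensemble runs."""
    votes, runs = {}, 0
    for seed, batch in enumerate(batches):
        runs += 1
        for edge, score in attribute_edges(model, batch, seed).items():
            if score > tau:
                votes[edge] = votes.get(edge, 0) + 1
    return {e for e, v in votes.items() if v / runs >= keep_frac}

# Toy usage over 10 resampled "batches" of a dummy model.
circuit = consensus_circuit(model=None, batches=[None] * 10)
print(f"{len(circuit)} consensus edges survive the ensemble vote")
```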

AI · Neutral · arXiv – CS AI · 4h ago · 0
🧠

Diagnosing Generalization Failures from Representational Geometry Markers

Researchers propose a new approach that predicts AI model failures by analyzing geometric properties of data representations rather than by reverse-engineering internal mechanisms. They find that reduced manifold dimensionality and reduced utility of training-data representations consistently predict poor performance on out-of-distribution tasks across different architectures and datasets.
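
To make the geometric-marker idea concrete, here is an illustrative sketch that assumes a standard PCA participation-ratio estimate of effective dimensionality as the marker; the paper's exact markers and its utility measure may differ:

```python
# Illustrative sketch (not the paper's exact marker): estimate the effective
# dimensionality of a representation set via the PCA participation ratio.
# Lower values would flag representations that are more likely to fail OOD.
import numpy as np

def participation_ratio(reps: np.ndarray) -> float:
    """Effective dimensionality of representations (n_samples x n_features)."""
    centered = reps - reps.mean(axis=0, keepdims=True)
    eig = np.clip(np.linalg.eigvalsh(np.cov(centered, rowvar=False)), 0.0, None)
    return float(eig.sum() ** 2 / (eig ** 2).sum())

# Toy comparison: well-spread representations vs. ones collapsed onto one axis.
rng = np.random.default_rng(0)
healthy = rng.normal(size=(500, 64))
collapsed = healthy @ rng.normal(size=(64, 64)) * np.array([1.0] + [0.01] * 63)
print(participation_ratio(healthy), participation_ratio(collapsed))
```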

AI · Neutral · arXiv – CS AI · 4h ago · 1
🧠

How Well Do Multimodal Models Reason on ECG Signals?

Researchers introduce a new framework for evaluating how well multimodal AI models reason about ECG signals by breaking down reasoning into perception (pattern identification) and deduction (logical application of medical knowledge). The framework uses automated code generation to verify temporal patterns and compares model logic against established clinical criteria databases.
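
A small, hypothetical example of the verification idea: generated code checks a temporal claim (here, bradycardia from R-peak intervals) and compares it with the model's stated finding. The threshold and names are placeholders, not entries from the framework's actual criteria database:

```python
# Hedged sketch of the "deduction check": programmatically verify a temporal
# claim about an ECG and compare it with what a multimodal model asserted.
import numpy as np

def heart_rate_bpm(r_peak_times_s: np.ndarray) -> float:
    """Mean heart rate from R-peak timestamps given in seconds."""
    rr_intervals = np.diff(r_peak_times_s)
    return 60.0 / float(rr_intervals.mean())

def verify_bradycardia(r_peak_times_s: np.ndarray, threshold_bpm: float = 60.0) -> bool:
    """Clinical-style rule: bradycardia if the mean rate falls below ~60 bpm."""
    return heart_rate_bpm(r_peak_times_s) < threshold_bpm

# Compare the programmatic check against the model's claimed finding.
r_peaks = np.cumsum(np.full(20, 1.25))          # regular rhythm at ~48 bpm
model_claim = {"finding": "bradycardia", "present": True}
agrees = verify_bradycardia(r_peaks) == model_claim["present"]
print(f"rate={heart_rate_bpm(r_peaks):.0f} bpm, model agrees with check: {agrees}")
```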