#llm-analysis News & Analysis

12 articles tagged with #llm-analysis. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 7 · 7/10

Feature Identification via the Empirical NTK

Researchers demonstrate that eigenanalysis of the empirical neural tangent kernel (eNTK) can identify learned feature directions in neural networks, from simple MLPs to large language models like Gemma-3-270M. The method shows strong alignment with known algorithmic features in modular arithmetic tasks and grammatical features in language models, outperforming PCA-based approaches and offering a new mechanistic interpretability tool.

AI · Neutral · arXiv – CS AI · Jun 23 · 5/10

Rebuttals Move Peer-Review Scores, but Initial-Review Structure Bounds the Movement

Researchers analyzed 73,000 reviewer trajectories from ICLR 2024-2025 to measure how author rebuttals affect peer-review scores. Using LLMs as measurement tools, they found that while rebuttals can move scores, initial review structure predicts most score movement, constraining rebuttal impact to measurable but bounded effects.

Claude · Opus · Gemini

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

When Probing Accuracy Saturates, Fragility Resolves: A Complementary Metric for LLM Pre-Training Analysis

Researchers introduce 'fragility' as a complementary metric to linear probing for analyzing large language model pre-training, addressing the limitation that probe accuracy saturates early in training and becomes insensitive to ongoing representational changes. By measuring activation noise tolerance levels, fragility reveals structural evolution in how models encode lexical versus compositional information across layers, demonstrating that data curation and architectural choices leave distinct signatures invisible to traditional accuracy metrics.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Contribution Weights: A Geometrical Analysis of Self-Attention Transformers

Researchers introduce Contribution Weights, a new metric for analyzing transformer attention that accounts for value vector geometry alongside attention weights. The approach more accurately identifies semantically critical tokens than traditional attention-based metrics and reveals that attention sinks actively suppress information rather than passively storing excess attention.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Closure-Validated Circuit Discovery in Attention Heads: Co-activation Proposes, Ablation Disposes

Researchers propose a methodology for validating attention-head circuits in large language models by combining co-activation clustering with causal ablation testing. They find that clustering signals can only propose candidate circuits; validating a circuit requires closure tests that measure functional impact through ablation, a distinction that challenges current interpretability approaches.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Understanding LLM Behavior in Multi-Target Cross-Lingual Summarization

Researchers introduce MEA, a new benchmark for multi-target cross-lingual summarization (MTXLS) covering 24 languages, and reveal that LLMs perform this task substantially worse than English monolingual summarization. A novel layer-wise analysis shows that translation and summarization behaviors emerge jointly in later layers rather than as separate stages, enabling a new activation steering method that improves MTXLS quality across languages.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Who Annotates in NLP? A Large-scale Assessment of Human Annotation Reporting between 2018 and 2025

A comprehensive audit of 1,603 NLP papers from 2018-2025 reveals that while researchers increasingly report operational annotation details like recruitment and expertise, critical information for assessing data validity (such as annotator training, language proficiency, compensation, and inter-annotator agreement) remains frequently omitted. The study establishes a scalable framework and reporting taxonomy to improve reproducibility and reliability in NLP research.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Developing a UXR Point of View for Cognitive Accessibility in Mobile Learning with Generative AI

Researchers developed a UX research framework combining the Point-of-View pyramid methodology with Large Language Model analysis to improve mobile learning requirements for users with cognitive disabilities. The study finds that usability challenges often stem from ambiguous requirements rather than interface design flaws, and proposes a Cognitive Accessibility UXR Playbook to embed accessibility principles into measurable, technically traceable specifications.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

LLMbench: A Comparative Close Reading Workbench for Large Language Models

LLMbench is a new browser-based tool that enables detailed comparative analysis of large language model outputs through side-by-side visualization and token-level probability inspection. Unlike existing quantitative comparison tools, it applies digital humanities methodology to make the probabilistic structure of LLM-generated text legible through multiple analytical overlays and visualization modes.

AI · Bullish · arXiv – CS AI · Apr 10 · 6/10

Improving Robustness In Sparse Autoencoders via Masked Regularization

Researchers propose a masked regularization technique to improve the robustness and interpretability of Sparse Autoencoders (SAEs) used in large language model analysis. The method addresses feature absorption and out-of-distribution performance failures by randomly replacing tokens during training to disrupt co-occurrence patterns, offering a practical path toward more reliable mechanistic interpretability tools.

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

Researchers introduced SkillSieve, a three-layer detection framework that identifies malicious AI agent skills in OpenClaw's ClawHub marketplace, where 13-26% of more than 13,000 skills contain security vulnerabilities. The system combines regex/AST scanning, LLM-based analysis with parallel sub-tasks, and multi-LLM voting to achieve an F1 score of 0.800 at $0.006 per skill, significantly outperforming existing detection methods.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Causality Elicitation from Large Language Models

Researchers propose a new pipeline to extract causal relationships from large language models by sampling documents, identifying events, and using causal discovery methods. The approach aims to reveal the causal hypotheses that LLMs assume rather than establishing real-world causality.