y0news

#llm-analysis News & Analysis

8 articles tagged with #llm-analysis. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 7 · 7/10

Feature Identification via the Empirical NTK

Researchers demonstrate that eigenanalysis of the empirical neural tangent kernel (eNTK) can identify learned feature directions in neural networks, from simple MLPs to large language models like Gemma-3-270M. The method shows strong alignment with known algorithmic features in modular arithmetic tasks and grammatical features in language models, outperforming PCA-based approaches and offering a new mechanistic interpretability tool.
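For orientation, the standard definition of the empirical NTK and its eigendecomposition can be written as below. This is only the textbook construction, assuming a scalar-output network $f(x;\theta)$ with $p$ parameters evaluated on inputs $x_1,\dots,x_n$; the symbols $J$, $K$, $\lambda_k$, and $u_k$ are notation chosen here, and the summary does not say how the paper maps eigenvectors to feature directions.

```latex
% Jacobian of the network outputs with respect to the parameters
J \in \mathbb{R}^{n \times p}, \qquad J_{i,:} = \nabla_\theta f(x_i;\theta)^{\top}

% Empirical NTK: Gram matrix of per-example parameter gradients
K = J J^{\top}, \qquad K_{ij} = \nabla_\theta f(x_i;\theta)^{\top}\, \nabla_\theta f(x_j;\theta)

% Eigenanalysis: K is symmetric positive semidefinite
K = \sum_{k=1}^{n} \lambda_k\, u_k u_k^{\top}, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0
```

Under this reading, the leading eigenvectors $u_k$ are the candidate directions that would be compared against known features.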

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

Understanding LLM Behavior in Multi-Target Cross-Lingual Summarization

Researchers introduce MEA, a new benchmark for multi-target cross-lingual summarization (MTXLS) covering 24 languages, and reveal that LLMs perform this task substantially worse than English monolingual summarization. A novel layer-wise analysis shows that translation and summarization behaviors emerge jointly in later layers rather than as separate stages, enabling a new activation steering method that improves MTXLS quality across languages.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

Who Annotates in NLP? A Large-scale Assessment of Human Annotation Reporting between 2018 and 2025

A comprehensive audit of 1,603 NLP papers from 2018-2025 reveals that while researchers increasingly report operational annotation details like recruitment and expertise, critical information for assessing data validity—such as annotator training, language proficiency, compensation, and inter-annotator agreement—remains frequently omitted. The study establishes a scalable framework and reporting taxonomy to improve reproducibility and reliability in NLP research.

AI · Neutral · arXiv – CS AI · 5d ago · 6/10

Developing a UXR Point of View for Cognitive Accessibility in Mobile Learning with Generative AI

Researchers developed a UX research framework that combines the Point-of-View pyramid methodology with Large Language Model analysis to improve mobile learning requirements for users with cognitive disabilities. The study finds that usability challenges often stem from ambiguous requirements rather than interface design flaws, and it proposes a Cognitive Accessibility UXR Playbook to embed accessibility principles into measurable, technically traceable specifications.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

LLMbench: A Comparative Close Reading Workbench for Large Language Models

LLMbench is a new browser-based tool that enables detailed comparative analysis of large language model outputs through side-by-side visualization and token-level probability inspection. Unlike existing quantitative comparison tools, it applies digital humanities methodology to make the probabilistic structure of LLM-generated text legible through multiple analytical overlays and visualization modes.

AI · Bullish · arXiv – CS AI · Apr 10 · 6/10

Improving Robustness In Sparse Autoencoders via Masked Regularization

Researchers propose a masked regularization technique to improve the robustness and interpretability of Sparse Autoencoders (SAEs) used in large language model analysis. The method addresses feature absorption and out-of-distribution performance failures by randomly replacing tokens during training to disrupt co-occurrence patterns, offering a practical path toward more reliable mechanistic interpretability tools.
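The token-replacement step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name `mask_tokens`, the `replace_prob` parameter, and the choice of uniform sampling over the vocabulary are all assumptions made here.

```python
import random


def mask_tokens(token_ids, vocab_size, replace_prob=0.1, rng=None):
    """Return a copy of token_ids with some positions swapped for random ids.

    Hypothetical helper: each position is independently replaced with a
    uniformly sampled vocabulary id with probability replace_prob.
    """
    if rng is None:
        rng = random.Random()
    out = []
    for tok in token_ids:
        if rng.random() < replace_prob:
            # Break the original co-occurrence pattern at this position.
            out.append(rng.randrange(vocab_size))
        else:
            out.append(tok)
    return out
```

In a training loop, a function like this would be applied to each batch before the activations fed to the SAE are computed; the replacement rate would be a tunable hyperparameter.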

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

Researchers introduce SkillSieve, a three-layer detection framework that identifies malicious AI agent skills in OpenClaw's ClawHub marketplace, where 13-26% of more than 13,000 skills contain security vulnerabilities. The system combines regex/AST scanning, LLM-based analysis with parallel sub-tasks, and multi-LLM voting to reach an F1 score of 0.800 at $0.006 per skill, significantly outperforming existing detection methods.
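One plausible reading of such a hierarchical triage is a cheap static filter followed by a majority vote among more expensive judges. The sketch below is a simplification under that assumption: the regex patterns, the function names, and the `judges` callables (stand-ins for LLM calls) are hypothetical, and the parallel sub-task layer is omitted.

```python
import re
from typing import Callable, List

# Hypothetical patterns for illustration only; the paper's actual rules
# are not given in the summary.
SUSPICIOUS_PATTERNS = [
    re.compile(r"curl\s+[^|]*\|\s*(sh|bash)"),  # piping a download into a shell
    re.compile(r"\beval\s*\("),                  # dynamic code evaluation
]


def static_scan(skill_text: str) -> bool:
    """Cheap first layer: flag a skill if any pattern matches."""
    return any(p.search(skill_text) for p in SUSPICIOUS_PATTERNS)


def majority_vote(verdicts: List[bool]) -> bool:
    """True only if strictly more than half of the verdicts are True."""
    return sum(verdicts) * 2 > len(verdicts)


def triage(skill_text: str, judges: List[Callable[[str], bool]]) -> bool:
    """Run the static scan first; only flagged skills reach the judges."""
    if not static_scan(skill_text):
        return False
    return majority_vote([judge(skill_text) for judge in judges])
```

Escalating only flagged items to the costly layer is what keeps the per-item cost low in a design like this, at the price of missing anything the static layer does not flag.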

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Causality Elicitation from Large Language Models

Researchers propose a new pipeline to extract causal relationships from large language models by sampling documents, identifying events, and using causal discovery methods. The approach aims to reveal the causal hypotheses that LLMs assume rather than establishing real-world causality.