y0news
🧠 AI

12,712 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

Automating Structural Analysis Across Multiple Software Platforms Using Large Language Models

Researchers developed a multi-agent LLM system that automates structural analysis workflows across multiple finite element analysis (FEA) platforms, including ETABS, SAP2000, and OpenSees. Using a two-stage architecture that interprets engineering specifications and translates them into platform-specific code, the system achieved over 90% accuracy on 20 representative frame problems, addressing a critical gap in practical AI-assisted engineering deployment.
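
For flavor, a minimal sketch of what a two-stage spec-to-code pipeline can look like; the `run_llm` helper, prompts, and JSON schema below are invented for illustration, not taken from the paper:

```python
# Hypothetical two-stage pipeline: stage 1 interprets a natural-language spec
# into a structured task, stage 2 translates it into platform-specific code.
import json

def run_llm(prompt: str) -> str:
    """Placeholder for any chat-completion call; an assumption, not the paper's API."""
    raise NotImplementedError

def interpret_spec(spec: str) -> dict:
    # Stage 1: extract geometry, supports, loads, and analysis type as JSON.
    prompt = f"Extract members, supports, loads, and analysis type as JSON:\n{spec}"
    return json.loads(run_llm(prompt))

def generate_platform_code(task: dict, platform: str) -> str:
    # Stage 2: emit a script for the target solver (e.g. an OpenSees model file).
    prompt = f"Write a {platform} script for this structural model:\n{json.dumps(task)}"
    return run_llm(prompt)

# code = generate_platform_code(interpret_spec("3-story steel frame, ..."), "OpenSees")
```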

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Relational Preference Encoding in Looped Transformer Internal States

Researchers demonstrate that looped transformers like Ouro-2.6B encode human preferences relationally rather than independently, with pairwise evaluators achieving 95.2% accuracy compared to 21.75% for independent classification. The study reveals that preference encoding is fundamentally relational, functioning as an internal consistency probe rather than a direct predictor of human annotations.

🏢 Anthropic
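
The distinction the paper draws can be made concrete with a toy probe setup. In this sketch the hidden states, labels, and probe construction are all synthetic stand-ins, assuming only that an "independent" probe scores each response alone while a "pairwise" probe reads the relation between two responses:

```python
# Independent vs. pairwise (relational) probes on internal states; all data
# here is synthetic and for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d = 1000, 64
h_a, h_b = rng.normal(size=(n, d)), rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)           # 1 if response A is preferred over B

# Independent probe: judge each response's hidden state on its own.
clf_ind = LogisticRegression(max_iter=1000).fit(
    np.vstack([h_a, h_b]), np.concatenate([y, 1 - y]))

# Pairwise probe: classify directly from the relation between the two states.
clf_pair = LogisticRegression(max_iter=1000).fit(h_a - h_b, y)
```
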
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Should We be Pedantic About Reasoning Errors in Machine Translation?

Researchers identified systematic reasoning errors in machine translation systems across seven language pairs, finding that while these errors can be detected with high precision in some languages like Urdu, correcting them produces minimal improvements in translation quality. This suggests that reasoning traces in neural machine translation models lack genuine faithfulness to their outputs, raising questions about the reliability of reasoning-based approaches in translation systems.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

From UAV Imagery to Agronomic Reasoning: A Multimodal LLM Benchmark for Plant Phenotyping

Researchers have developed PlantXpert, a multimodal AI benchmark for evaluating vision-language models on agricultural phenotyping tasks for soybean and cotton. The benchmark tests 11 state-of-the-art models across disease detection, pest control, weed management, and yield prediction, revealing that fine-tuned models achieve up to 78% accuracy but struggle with complex reasoning and cross-crop generalization.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

The Rise and Fall of $G$ in AGI

Researchers apply psychometric analysis to large language model benchmarks, discovering that AI's general intelligence factor (G-factor) peaked around 2023-2024 before fragmenting as models specialized in reasoning tasks. The finding suggests AI development is shifting from unified capability improvement toward specialized tool-using systems, challenging assumptions about monolithic AGI progress.
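
The underlying psychometric computation is standard factor analysis. A toy version, with a synthetic model-by-benchmark score matrix standing in for real leaderboard data:

```python
# Read the "G-factor" strength as the variance explained by the first
# principal component of a (models x benchmarks) score matrix.
import numpy as np

rng = np.random.default_rng(0)
g = rng.normal(size=(20, 1))                    # latent general ability per model
scores = g @ rng.uniform(0.5, 1.0, (1, 8)) + 0.3 * rng.normal(size=(20, 8))

z = (scores - scores.mean(0)) / scores.std(0)   # standardize each benchmark
eigvals = np.linalg.eigvalsh(np.cov(z.T))[::-1] # eigenvalues, largest first
print("share of variance on first factor:", eigvals[0] / eigvals.sum())
```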

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

A Minimal Model of Representation Collapse: Frustration, Stop-Gradient, and Dynamics

Researchers present a minimal mathematical model demonstrating how representation collapse occurs in self-supervised learning when frustrated (misclassified) samples exist, and show that stop-gradient techniques prevent this failure mode. The work provides closed-form analysis of gradient-flow dynamics and fixed points, offering theoretical insights into why modern embedding-based learning systems sometimes lose discriminative power.
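
The stop-gradient mechanism itself is a one-liner. A generic SimSiam-style sketch (not the paper's exact model) showing where the `.detach()` goes and why it blocks the collapsed fixed point:

```python
import torch
import torch.nn.functional as F

def siamese_loss(online: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Stop-gradient: the target branch is treated as a constant, so gradients
    # cannot push both branches toward the trivial constant-output solution.
    target = target.detach()
    return -F.cosine_similarity(online, target, dim=-1).mean()
```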

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Like a Hammer, It Can Build, It Can Break: Large Language Model Uses, Perceptions, and Adoption in Cybersecurity Operations on Reddit

A research study analyzing 892 Reddit posts from cybersecurity forums reveals how security practitioners currently use, perceive, and adopt large language models in Security Operations Centers. While practitioners leverage LLMs for productivity gains in low-risk tasks, significant concerns about reliability, verification overhead, and security risks prevent broader autonomous deployment in critical security operations.

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

CoSToM: Causal-oriented Steering for Intrinsic Theory-of-Mind Alignment in Large Language Models

Researchers introduce CoSToM, a framework that uses causal tracing and activation steering to improve Theory of Mind alignment in large language models. The work addresses a critical gap between LLMs' internal knowledge and external behavior, demonstrating that targeted interventions in specific neural layers can enhance social reasoning capabilities and dialogue quality.
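
Activation steering in general (CoSToM's specific causal-tracing procedure aside) amounts to adding a fixed direction to one layer's hidden states at inference time. A generic PyTorch sketch; the layer path and scale `alpha` are assumptions:

```python
import torch

def make_steering_hook(vector: torch.Tensor, alpha: float = 4.0):
    # Forward hook that shifts the layer's hidden states along a steering
    # direction; returning a value from the hook replaces the layer output.
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * vector
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    return hook

# Hypothetical usage (layer path depends on the model family):
# handle = model.transformer.h[12].register_forward_hook(make_steering_hook(v))
# ... generate ..., then handle.remove()
```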

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

Closed-Form Concept Erasure via Double Projections

Researchers present a novel closed-form method for concept erasure in generative AI models that removes unwanted concepts without iterative training. The technique uses linear transformations and two sequential projection steps to safely edit pretrained models like Stable Diffusion and FLUX while preserving unrelated concepts, completing the process in seconds.

🧠 Stable Diffusion
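
A single nullspace projection already conveys the closed-form flavor; the paper's double-projection construction presumably refines this. A minimal sketch, assuming the concept is represented by one direction `v` in weight space:

```python
# Closed-form concept erasure by projection: remove the component of weight
# rows lying along a concept direction, with no iterative training.
import numpy as np

def erase_direction(W: np.ndarray, v: np.ndarray) -> np.ndarray:
    v = v / np.linalg.norm(v)
    P = np.eye(len(v)) - np.outer(v, v)   # projector onto the complement of v
    return W @ P                          # one-shot edit of the weights

W = np.random.randn(16, 8)
v = np.random.randn(8)
W_edited = erase_direction(W, v)
assert np.allclose(W_edited @ v, 0)       # the concept direction is nulled out
```
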
AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

Degradation-Consistent Paired Training for Robust AI-Generated Image Detection

Researchers propose Degradation-Consistent Paired Training (DCPT), a training methodology that significantly improves AI-generated image detector robustness against real-world corruptions like JPEG compression and blur. The approach uses paired consistency constraints without adding parameters or inference overhead, achieving 9.1% accuracy improvement on degraded images while maintaining performance on clean images.
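
One plausible reading of a paired-consistency objective, hedged since the paper's exact losses are not reproduced here: the detector sees each clean image alongside its degraded twin and is penalized when the two predictions disagree:

```python
import torch
import torch.nn.functional as F

def dcpt_loss(model, x_clean, x_degraded, labels, lam: float = 1.0):
    # Same detector, no extra parameters: classify both versions of each image
    # and add a consistency term tying degraded predictions to clean ones.
    logits_c = model(x_clean)
    logits_d = model(x_degraded)
    cls = F.cross_entropy(logits_c, labels) + F.cross_entropy(logits_d, labels)
    consistency = F.mse_loss(logits_d, logits_c.detach())
    return cls + lam * consistency
```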

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

From Helpful to Trustworthy: LLM Agents for Pair Programming

Doctoral research proposes a systematic framework for multi-agent LLM pair programming that improves code reliability and auditability through externalized intent and iterative validation. The study addresses critical gaps in how AI coding agents can produce trustworthy outputs aligned with developer objectives across testing, implementation, and maintenance workflows.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

CodaRAG: Connecting the Dots with Associativity Inspired by Complementary Learning

Researchers introduce CodaRAG, a framework that enhances Retrieval-Augmented Generation by treating evidence retrieval as active associative discovery rather than passive lookup. The system achieves 7-10% gains in retrieval recall and 3-11% improvements in generation accuracy by consolidating fragmented knowledge, navigating multi-dimensional pathways, and eliminating noise.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

A Queueing-Theoretic Framework for Dynamic Attack Surfaces: Data-Integrated Risk Analysis and Adaptive Defense

Researchers develop a queueing-theoretic framework that models cyber-attack surfaces as dynamic systems where vulnerabilities arrive and depart over time. Using reinforcement learning and Markov decision processes, they demonstrate an adaptive defense strategy that reduces active vulnerabilities by over 90% in software supply chains without increasing maintenance budgets.
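
The "vulnerabilities arrive and depart" picture is a birth-death process, which is easy to simulate directly. A toy Gillespie-style sketch with illustrative rates, not the paper's calibrated parameters:

```python
# Birth-death simulation of a dynamic attack surface: vulnerabilities arrive
# at rate lam; each open one is remediated independently at rate mu.
import numpy as np

rng = np.random.default_rng(0)
lam, mu, t, horizon, open_vulns = 2.0, 0.5, 0.0, 100.0, 0

while t < horizon:
    rate = lam + mu * open_vulns            # total event rate (M/M/inf-style)
    t += rng.exponential(1.0 / rate)        # time to next event
    if rng.random() < lam / rate:
        open_vulns += 1                     # new vulnerability disclosed
    else:
        open_vulns -= 1                     # one vulnerability patched
print("open vulnerabilities at horizon:", open_vulns)
```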

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Toward Accountable AI-Generated Content on Social Platforms: Steganographic Attribution and Multimodal Harm Detection

Researchers propose a steganography-based attribution framework that embeds cryptographic identifiers into AI-generated images to combat harmful misuse on social platforms. The system combines watermarking techniques with CLIP-based multimodal detection to achieve 0.99 AUC-ROC performance, enabling reliable forensic tracing of synthetic media used in misinformation campaigns.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

AI Patents in the United States and China: Measurement, Organization, and Knowledge Flows

Researchers developed an advanced AI classifier achieving 97% precision in identifying AI patents, revealing that both the U.S. and China are rapidly expanding AI innovation but through fundamentally different institutional structures. While China recently surpassed the U.S. in annual patent volume, American AI patenting remains concentrated among large private firms, whereas Chinese innovation is more geographically dispersed across universities and state-owned enterprises.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Towards an Appropriate Level of Reliance on AI: A Preliminary Reliance-Control Framework for AI in Software Engineering

Researchers propose a reliance-control framework for AI tools in software development, based on interviews with 22 developers using LLMs. The study addresses the tension between overreliance (risking skill atrophy) and underreliance (missing productivity gains), offering guidance for developers, educators, and policymakers on appropriate AI tool usage.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Machine Learning-Based Detection of MCP Attacks

Researchers developed machine learning models to detect malicious Model Context Protocol (MCP) attacks, achieving up to 100% F1-score on binary classification and 90.56% on multiclass detection tasks. The study addresses a critical security gap in MCP technology, which extends LLM capabilities but introduces new attack surfaces, and includes a middleware solution for real-world deployment.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

LLMs Should Incorporate Explicit Mechanisms for Human Empathy

Researchers argue that Large Language Models lack explicit empathy mechanisms, systematically failing to preserve human perspectives, affect, and context despite strong benchmark performance. The paper identifies four recurring empathic failures—sentiment attenuation, granularity mismatch, conflict avoidance, and linguistic distancing—and proposes empathy-aware objectives as essential components of LLM development.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Early Decisions Matter: Proximity Bias and Initial Trajectory Shaping in Non-Autoregressive Diffusion Language Models

Researchers identify a critical failure mode in non-autoregressive diffusion language models caused by proximity bias, where the denoising process concentrates on adjacent tokens, creating spatial error propagation. They propose a minimal-intervention approach using a lightweight planner and temperature annealing to guide early token selection, achieving substantial improvements on reasoning and planning tasks.
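
Temperature annealing over denoising steps is straightforward to sketch; the linear schedule and endpoint values below are assumptions, not the paper's settings:

```python
# Sample early denoising steps conservatively (low temperature), then relax,
# so the first committed tokens anchor the trajectory before errors propagate.
import numpy as np

def annealed_temperature(step: int, total: int, t0: float = 0.3, t1: float = 1.0):
    frac = step / max(total - 1, 1)
    return t0 + (t1 - t0) * frac          # linear ramp, cautious -> exploratory

def sample_token(logits: np.ndarray, temperature: float) -> int:
    scaled = logits / temperature
    p = np.exp(scaled - np.max(scaled))   # softmax with numerical stabilization
    p /= p.sum()
    return int(np.random.choice(len(p), p=p))
```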

AI · Bearish · arXiv – CS AI · Apr 14 · 6/10

Calibration Collapse Under Sycophancy Fine-Tuning: How Reward Hacking Breaks Uncertainty Quantification in LLMs

A research study demonstrates that fine-tuning language models with sycophantic reward signals degrades their calibration—the ability to accurately quantify uncertainty—even as performance metrics improve. While the effect lacks statistical significance in this experiment, the findings reveal that reward-optimized models retain structured miscalibration even after post-hoc corrections, establishing a methodology for evaluating hidden degradation in fine-tuned systems.
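
Calibration claims of this kind are typically measured with Expected Calibration Error. A standard equal-width-bin implementation for reference; this is the generic metric, not the paper's evaluation pipeline:

```python
# ECE: weighted average over confidence bins of |mean confidence - accuracy|.
import numpy as np

def ece(confidences: np.ndarray, correct: np.ndarray, n_bins: int = 10) -> float:
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            total += mask.mean() * gap    # bin weight times calibration gap
    return total
```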

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

NSFL: A Post-Training Neuro-Symbolic Fuzzy Logic Framework for Boolean Operators in Neural Embeddings

Researchers introduce Neuro-Symbolic Fuzzy Logic (NSFL), a training-free framework that enables neural embedding systems to perform complex logical operations without retraining. The approach combines fuzzy logic mathematics with neural embeddings, achieving up to 81% mAP improvements across multiple encoder configurations and demonstrating broad applicability to existing AI retrieval systems.
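
The fuzzy-logic connectives involved are classical. A sketch using the product t-norm family on similarity scores in [0, 1], conveying the general idea NSFL builds on rather than its specific formulation:

```python
def fuzzy_and(a: float, b: float) -> float:
    return a * b                     # product t-norm

def fuzzy_or(a: float, b: float) -> float:
    return a + b - a * b             # probabilistic sum (the dual t-conorm)

def fuzzy_not(a: float) -> float:
    return 1.0 - a

# e.g. rank documents d by fuzzy_and(sim(q1, d), fuzzy_not(sim(q2, d)))
```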

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment

Researchers used computational lesions on multilingual large language models to identify how the brain processes language across different languages. By selectively disabling parameters, they found that a shared computational core handles 60% of multilingual processing, while language-specific components fine-tune predictions for individual languages, providing new insights into how multilingual AI aligns with human neurobiology.
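
A "computational lesion" in its simplest form is parameter ablation. A generic sketch, not the paper's procedure: zero a random subset of weights in place, then re-measure the alignment metric of interest:

```python
import torch

def lesion_(module: torch.nn.Module, fraction: float = 0.1, seed: int = 0):
    # Disable a random subset of parameters; comparing metrics before and
    # after localizes which components a capability depends on.
    g = torch.Generator().manual_seed(seed)
    for p in module.parameters():
        mask = torch.rand(p.shape, generator=g) < fraction
        with torch.no_grad():
            p[mask] = 0.0
```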

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

Efficient Process Reward Modeling via Contrastive Mutual Information

Researchers propose CPMI, an automated method for training process reward models that reduces annotation costs by 84% and computational overhead by 98% compared to traditional Monte Carlo approaches. The technique uses contrastive mutual information to assign reward scores to reasoning steps in AI chain-of-thought trajectories without expensive human annotation or repeated LLM rollouts.
