AI

21,532 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

AI · Neutral · arXiv – CS AI · Jun 16/10

Spectral Collapse Drives Loss of Plasticity in Deep Continual Learning

Researchers find that deep neural networks lose plasticity during continual learning because of Hessian spectral collapse, in which curvature information vanishes and gradient-based optimization stalls. The study proposes regularization that combines maintaining a high effective feature rank with L2 penalties to preserve learning capacity across sequential tasks.

AI · Neutral · arXiv – CS AI · Jun 16/10

PAC-Bayesian Reinforcement Learning Trains Generalizable Policies

Researchers have developed a novel PAC-Bayesian generalization bound for reinforcement learning that addresses the problem of sequential data dependencies, enabling non-vacuous generalization certificates for off-policy algorithms like Soft Actor-Critic. The work introduces PB-SAC, an algorithm that leverages this bound to guide exploration while maintaining competitive performance on continuous control tasks.

AI · Bullish · arXiv – CS AI · Jun 16/10

Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models

Researchers propose Boundary-Guided Policy Optimization (BGPO), a memory-efficient reinforcement learning algorithm for diffusion large language models that addresses a critical bottleneck in likelihood function approximation. By constructing a specially designed lower bound that enables gradient accumulation across samples while maintaining mathematical equivalence to traditional objectives, BGPO achieves superior performance on math, coding, and planning tasks with significantly reduced memory overhead.

AI · Neutral · arXiv – CS AI · Jun 16/10

CaptionFormer: Unified Segmentation, Tracking, and Captioning for Spatio-Temporal Objects

Researchers introduce CaptionFormer, an end-to-end model that simultaneously detects, segments, tracks, and captions objects in video sequences. The work addresses Dense Video Object Captioning by generating synthetic training data using vision-language models and extends existing datasets, achieving state-of-the-art results across multiple benchmarks.

AI · Bullish · arXiv – CS AI · Jun 16/10

Mixture of Horizons in Action Chunking

Researchers propose Mixture of Horizons (MoH), a novel technique for vision-language-action models in robotics that processes action sequences at multiple time scales simultaneously to balance long-term planning with short-term precision. The method achieves state-of-the-art performance on robotic manipulation tasks, reaching a 99% success rate on LIBERO benchmarks while enabling 2.5x faster inference through adaptive horizon selection.

AI · Neutral · arXiv – CS AI · Jun 16/10

Reasoning-Aware Multimodal Fusion for Hateful Video Detection

Researchers introduce RAMF (Reasoning-Aware Multimodal Fusion), a machine learning framework designed to detect hateful content in videos by combining visual, audio, and textual data with adversarial reasoning. The method achieves 3-7% performance improvements over existing approaches, addressing the challenge of identifying nuanced hate speech in increasingly complex online video content.

AI · Neutral · arXiv – CS AI · Jun 16/10

Conditional Coverage Diagnostics for Conformal Prediction

Researchers introduce Excess Risk of Target Coverage (ERT), a new metric framework for evaluating conditional coverage in conformal prediction systems. The approach reformulates coverage assessment as a classification problem, providing more statistically powerful diagnostics than existing methods while offering conservative estimates of miscoverage and enabling distinction between over- and under-coverage effects.

AI · Neutral · arXiv – CS AI · Jun 16/10

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Researchers propose Bottom-up Policy Optimization (BuPO), a novel reinforcement learning approach that optimizes internal layers of language models rather than treating them as unified policies. The study reveals that LLMs contain distinct internal policy structures with different entropy patterns across layers, offering new insights into how transformer-based models process reasoning tasks.

🧠 Llama

AI · Neutral · arXiv – CS AI · Jun 16/10

FEM-Bench: A Structured Scientific Reasoning Benchmark for Evaluating Code-Generating LLMs

Researchers introduce FEM-Bench, a scientific reasoning benchmark designed to evaluate large language models' ability to generate correct finite element method (FEM) code for computational mechanics problems. Despite the simplicity of introductory-level tasks, current state-of-the-art LLMs show inconsistent performance, with Gemini 3 Pro completing 30/33 tasks at least once and GPT-5 achieving 73.8% success on unit test writing.

🧠 GPT-5 · 🧠 Gemini

AI · Neutral · arXiv – CS AI · Jun 16/10

Rethinking Multimodal Few-Shot 3D Point Cloud Segmentation: From Fused Refinement to Decoupled Arbitration

Researchers present DA-FSS, a new deep learning model that improves 3D point cloud segmentation by decoupling semantic and geometric processing paths rather than fusing them together. The approach addresses fundamental limitations in existing multimodal few-shot learning methods, demonstrating superior performance on standard benchmark datasets.

AI · Neutral · arXiv – CS AI · Jun 16/10

Performance and Complexity Trade-off Optimization of Speech Models During Training

Researchers propose a novel reparameterization technique using feature noise injection that enables joint optimization of speech model performance and computational complexity during training via gradient descent. Unlike post-hoc methods like pruning or quantization, this approach dynamically optimizes model size without heuristic weight-selection criteria, demonstrated through voice activity detection and audio anti-spoofing applications.

AI · Neutral · arXiv – CS AI · Jun 15/10

SKETCH: Semantic Key-Point Conditioning for Long-Horizon Vessel Trajectory Prediction

Researchers propose SKETCH, a semantic key-point-conditioned framework that improves long-horizon vessel trajectory prediction by decomposing the problem into high-level navigational intent and local motion modeling. The method outperforms existing approaches on real-world AIS data, particularly for extended time horizons and directional accuracy.

AI · Neutral · arXiv – CS AI · Jun 16/10

Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data

Researchers propose Gap-K%, a novel method for detecting whether text was part of an LLM's pretraining data by analyzing the probability gap between a model's top prediction and the actual target token. The technique outperforms existing approaches on standard benchmarks and addresses critical privacy and copyright concerns surrounding the opaque datasets used to train large language models.

AI · Neutral · arXiv – CS AI · Jun 16/10

ParalESN: Enabling parallel information processing in Reservoir Computing

Researchers introduce Parallel Echo State Network (ParalESN), a novel machine learning architecture that enables parallel processing of temporal data while maintaining the theoretical guarantees of traditional Reservoir Computing. The architecture delivers orders-of-magnitude computational savings without sacrificing predictive accuracy, offering a scalable pathway for integrating reservoir computing with modern deep learning systems.

AI · Bullish · arXiv – CS AI · Jun 16/10

The Gaussian-Head OFL Family: One-Shot Federated Learning from Client Global Statistics

Researchers introduce Gaussian-Head OFL, a one-shot federated learning method that reduces communication overhead to a single round by transmitting only statistical summaries instead of full models. The approach combines closed-form Gaussian classifiers with synthetic data generation, achieving competitive accuracy while maintaining privacy and eliminating dependency on public datasets.

AI · Neutral · arXiv – CS AI · Jun 16/10

Mixture of Concept Bottleneck Experts

Researchers introduce Mixture of Concept Bottleneck Experts (M-CBE), a framework that enhances interpretable AI by allowing multiple expert expressions to map concepts to predictions rather than a single predetermined function. The approach combines Linear M-CBE and Symbolic M-CBE variants to improve both accuracy and adaptability while maintaining human-understandable decision-making processes.

AI · Neutral · arXiv – CS AI · Jun 16/10

Stop the Flip-Flop: Context-Preserving Verification for Fast Revocable Diffusion Decoding

Researchers introduce COVER, a new verification technique for diffusion language models that eliminates inefficient token oscillations during parallel decoding. By using KV cache overrides to preserve context while selectively verifying tokens in a single forward pass, COVER accelerates inference while maintaining output quality.

AI · Neutral · arXiv – CS AI · Jun 16/10

A Kinetic Energy Perspective of Flow Matching

Researchers introduce Kinetic Path Energy (KPE), a physics-inspired metric for evaluating flow-based generative models that measures the dynamical effort of sampling trajectories. The analysis reveals a non-monotonic relationship between trajectory energy and generation quality, where excessive energy causes memorization rather than genuine generation, leading to a training-free inference method called Kinetic Trajectory Shaping that improves output fidelity.

AI · Neutral · arXiv – CS AI · Jun 16/10

Inverting Data Transformations via Diffusion Sampling

Researchers introduce TIED (Transformation-Inverting Energy Diffusion), a novel machine learning method that recovers inverse transformations on Lie groups using diffusion sampling. The approach improves neural network robustness to input transformations at test time, with applications in image processing and physics-informed modeling.

AI · Bullish · arXiv – CS AI · Jun 16/10

Breaking the Simplification Bottleneck in Amortized Neural Symbolic Regression

Researchers introduce SimpliPy, a rule-based simplification engine that accelerates symbolic regression by 100x compared to SymPy, enabling the amortized neural symbolic regression method Flash-ANSR to match state-of-the-art genetic programming approaches while producing more concise expressions.

AI · Neutral · arXiv – CS AI · Jun 16/10

A Behavioural and Representational Evaluation of Goal-Directedness in Language Model Agents

Researchers propose a novel framework combining behavioral and interpretability analyses to evaluate goal-directedness in language model agents. Testing an LLM navigating a 2D grid world, they find the model encodes spatial representations and multi-step plans internally while maintaining robust performance across varying task difficulties, revealing that introspective examination is necessary to fully understand how AI systems represent and pursue objectives.

AI · Neutral · arXiv – CS AI · Jun 16/10

Effective Reasoning Chains Reduce Intrinsic Dimensionality

Researchers demonstrate that effective chain-of-thought reasoning reduces intrinsic dimensionality (the minimum number of model dimensions needed to achieve target accuracy), offering a quantifiable metric for understanding why reasoning strategies improve language model generalization. Testing on GSM8K with Gemma models reveals that lower intrinsic dimensionality correlates strongly with better performance on both in-distribution and out-of-distribution tasks.

AI · Neutral · arXiv – CS AI · Jun 16/10

Weight Decay Improves Language Model Plasticity

Researchers demonstrate that weight decay during language model pretraining significantly improves model plasticity—the ability to adapt to downstream tasks through fine-tuning. The study reveals counterintuitive findings where higher weight decay produces weaker base models but stronger performance after task-specific training, challenging conventional approaches to hyperparameter optimization.

AI · Neutral · arXiv – CS AI · Jun 16/10

SCOPE: Selective Conformal Optimized Pairwise LLM Judging

Researchers introduce SCOPE, a framework that improves LLM-based pairwise evaluation by calibrating confidence thresholds to control error rates. Combined with a new uncertainty metric called Bidirectional Preference Entropy (BPE), the approach achieves reliable judgment quality while accepting significantly more evaluations than existing methods.

AI · Neutral · arXiv – CS AI · Jun 16/10

DTBench: A Synthetic Benchmark for Document-to-Table Extraction

Researchers introduce DTBench, a synthetic benchmark for evaluating large language models on document-to-table extraction tasks. Using a reverse Table2Doc synthesis approach with multi-agent workflows, the benchmark covers 13 subcategories across 5 major capability areas, revealing significant performance gaps and persistent challenges in reasoning and conflict resolution across mainstream LLMs.
