AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠AMARIS is a new system that improves how large language models are trained using reinforcement learning by maintaining a persistent memory of past training data and failures. Unlike existing methods that only look at immediate, local information, AMARIS tracks recurring problems and previous rubric adjustments over time, achieving measurable performance improvements across multiple domains.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers introduce CUDAnalyst, a new analysis framework that reveals how large language models make planning decisions when generating CUDA kernels by decomposing feedback signals. The study demonstrates that explicit planning helps only when feedback is well-aligned and that effective planning emerges from structured multi-feedback interactions, with findings showing robustness across different models and workloads.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers introduce N2I-RAG, an AI framework that automates computation of legal indicators from normative texts using retrieval-augmented generation with built-in validation mechanisms. The system addresses hallucination risks in traditional language models by emphasizing traceability and evidence grounding, demonstrating strong performance on French marine environmental law.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers have developed BioFact-MoE, a machine learning framework that uses specialized expert networks to separately analyze liver and tumor factors in hepatocellular carcinoma prognosis. The model achieves superior survival prediction accuracy (AUC above 75% at 12-18 months) while providing interpretable biological insights into treatment heterogeneity.
AI · Neutral · arXiv – CS AI · 4d ago · 5/10
🧠Uniboost is a new traffic allocation framework for recommendation systems that uses posterior value alignment and linear boosting to improve interpretability and efficiency in allocating traffic across business objectives. The system reduces score inflation and decouples allocation plans, demonstrating improved performance in online A/B tests with practical applications for large-scale industrial recommendation systems.
🏢 Meta
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers introduce SL-BiLEM, a machine learning framework that improves epidemic forecasting by accounting for how human behavior changes in response to disease spread and policy interventions. The model uses physical constraints to maintain accuracy even under novel policy scenarios, demonstrating a 76% improvement over existing neural baselines and potential applications in public health decision-making.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers introduce EEG-FM-Audit, a comprehensive evaluation framework for EEG Foundation Models (FMs) that reveals properly tuned supervised baselines can match or exceed state-of-the-art FMs with significantly fewer parameters. The study demonstrates that the effectiveness of a learning paradigm depends heavily on dataset scale and architecture, and it introduces neurophysiological probing to improve model interpretability.
🏢 Meta
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers introduce Recon, a method for improving user modeling by evaluating synthesized reasoning traces through action reconstruction rather than post-hoc rationalization. The approach achieves a 54.7% win rate over baseline methods and demonstrates that reasoning should naturally elicit predicted actions from context, advancing AI's ability to simulate human behavior.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers propose a representation-readout decomposition framework that explains anomalous neural network training phenomena like grokking and double descent by analyzing two competing learning processes: representation learning in encoders and readout calibration in classifiers. The framework provides task-agnostic diagnostics that reveal these phenomena stem from fluctuations in relative learning speeds rather than mysterious delays, challenging existing lazy-to-rich learning theories.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers studying one-layer Transformers discovered that architectural choices in feedforward networks (FFNs)—particularly sparse mixture-of-experts (MoE) routing—fundamentally reshape how attention mechanisms learn to compute, with sparsity rather than learned specialization driving this computational redistribution.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce OracleTSC, an LLM-based traffic signal control system that combines reward hurdle mechanisms and uncertainty regularization to stabilize reinforcement learning training. The approach achieves a 75% reduction in travel time while maintaining interpretability through natural language explanations, and it generalizes well across intersections.
AI · Neutral · arXiv – CS AI · May 12 · 5/10
🧠Researchers establish connections between Consistency-Based Diagnosis (CBD) and Actual Causality frameworks within Explainable AI (XAI), addressing a gap in how diagnosis systems explain their outputs. This theoretical work bridges two previously disconnected areas in AI research, with potential applications for making data management systems more interpretable and trustworthy.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers demonstrate that language models develop semantic role understanding (who-did-what-to-whom comprehension) primarily during pre-training, though fine-tuning still improves performance. Using linear probes on frozen transformer models, they find semantic role information emerges from language modeling objectives alone, with representation structure becoming more distributed as models scale.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce Probabilistic Logical Knowledge Tracing (PLKT), an interpretable AI framework that uses Beta-distributed probabilistic embeddings to model student knowledge states and predict learning performance. Unlike conventional deep learning approaches that rely on opaque deterministic embeddings, PLKT constructs transparent reasoning paths showing how past interactions influence predictions while maintaining superior accuracy compared to existing methods.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce MarsTSC, a novel framework combining Vision Language Models with agentic reasoning for few-shot multimodal time series classification. The system uses collaborative AI roles—Generator, Reflector, and Modifier—to iteratively refine knowledge and improve classification accuracy across 12 benchmarks while providing interpretable explanations.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce an anchor-projection framework that enables behavioral directions to transfer across different large language model families by mapping their diverse hidden representations into a shared coordinate space. The approach achieves high cross-model alignment (0.83 ten-way detection accuracy) without fine-tuning, demonstrating that interpretability and control mechanisms can be standardized across architecturally different models.
🧠 Llama
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce E-TCAV, an optimized version of TCAV that improves the efficiency and stability of neural network interpretability testing by leveraging penultimate layer representations. The method achieves linear speed-ups while maintaining accuracy, advancing practical tools for model debugging and real-time concept-guided training across vision and language tasks.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers present Hierarchical Causal Abduction (HCA), a framework that makes Model Predictive Control decisions interpretable by combining physics-informed reasoning, optimization evidence, and causal discovery. The method achieves 53% higher explanation accuracy than existing approaches across industrial control applications, addressing a critical barrier to deploying AI in safety-critical infrastructure.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers found that machine learning models trained on elite European football leagues lose interpretability and reliability when applied to university-level competition, suggesting that performance insights don't transfer across competition tiers. The study reveals that explanation stability and feature importance hierarchies are domain-dependent, challenging the assumption that ML-derived performance determinants are universally applicable.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers analyzed how Qwen3-VL-8B, a multimodal transformer, encodes visual interestingness—a measure derived from human engagement data—without explicit supervision. Using neuroscience-inspired methods, they found that the model's internal representations align with human-derived interestingness scores, suggesting transformers may capture principles of human attention and perception.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers developed an explainable machine learning framework that uses unsupervised and supervised learning to identify and interpret dietary patterns from UK nutrition survey data. The system discovered four distinct eating patterns and achieved high accuracy in reproducing classifications, with potential applications for dietitian-assisted clinical assessments and personalized nutrition counseling.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers demonstrate that early layers of cohort-trained Implicit Neural Representations (INRs) encode transferable features for signal fitting, identifying optimal freezing points through weight stable rank analysis. Using sparse autoencoders for mechanistic interpretability, they reveal that SIREN and Fourier-feature MLPs learn fundamentally different dictionary representations despite comparable performance, with implications for designing more generalizable neural architectures.
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce CAMAL, a method that leverages segmentation masks to improve attention alignment and faithfulness in vision models across deep learning and reinforcement learning paradigms. The approach improves attention faithfulness by over 35% while maintaining or improving generalization performance, at no additional inference cost.
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce Lattice Deduction Transformers (LDT), a specialized neural architecture that achieves near-perfect accuracy on constraint-solving puzzles like Sudoku and Mazes while remaining logically sound. The approach demonstrates that smaller models with domain-specific architectures can outperform large language models on reasoning tasks.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers propose Bridge Matching, a novel framework that decomposes stochastic generative model dynamics into deterministic transport and diffusion-induced osmotic effects. This decomposition enables more interpretable and controllable generative sampling by separately parameterizing how probability mass moves versus how stochastic fluctuations affect the process.