Models, papers, tools. 39,893 articles with AI-powered sentiment analysis and key takeaways.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers successfully modernized NMAP-RKPM, a 60,000-line Fortran physics simulation engine, from single-threaded MPI to parallel C++ using a structured agentic AI approach. Rather than relying on LLMs alone, the team developed a 'hand-holding' methodology combining manual examples, continuous buildability checks, and scoped sessions that proved highly effective for legacy code transformation.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce SNR-ST-Mix, a data augmentation framework designed specifically for spatial transcriptomics that uses geometry-aware and expression-aware mixing to improve deep neural network performance. The method constrains data interpolation to k-nearest spatial neighbors and weights coefficients by expression similarity, enabling more biologically plausible synthetic training samples that enhance prediction accuracy without architectural changes.
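The neighbor-constrained mixing described above can be illustrated roughly as follows. This is a minimal sketch, not the paper's formulation: the helper names (`knn_indices`, `neighbor_mix`), the use of cosine similarity, and the rule that scales the mixing coefficient by similarity up to a cap `max_lam` are all assumptions made for illustration.

```python
import math
import random

def knn_indices(coords, i, k):
    """Indices of the k spots spatially closest to spot i (Euclidean distance)."""
    dists = sorted(
        (math.dist(coords[i], coords[j]), j)
        for j in range(len(coords)) if j != i
    )
    return [j for _, j in dists[:k]]

def cosine(u, v):
    """Cosine similarity between two expression vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def neighbor_mix(coords, expr, i, k=3, max_lam=0.5, rng=random):
    """Mix spot i with one of its k nearest spatial neighbors.

    Assumed weighting rule: the mixing coefficient grows with expression
    similarity, so dissimilar neighbors contribute less to the synthetic sample.
    """
    j = rng.choice(knn_indices(coords, i, k))
    sim = min(1.0, max(0.0, cosine(expr[i], expr[j])))
    lam = max_lam * sim
    mixed = [(1 - lam) * a + lam * b for a, b in zip(expr[i], expr[j])]
    return mixed, j, lam

# Hypothetical toy data: five spots with 2-D coordinates and three genes each.
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
expr = [
    [1.0, 0.0, 2.0],
    [0.9, 0.1, 2.1],
    [1.2, 0.0, 1.8],
    [0.0, 3.0, 0.1],
    [0.1, 2.9, 0.0],
]
mixed, partner, lam = neighbor_mix(coords, expr, 0, k=2, rng=random.Random(0))
```

Restricting the partner to spatial neighbors keeps synthetic samples from blending distant, unrelated tissue regions, which is the intuition behind the "biologically plausible" claim in the summary.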
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers demonstrate that direct neural network approaches fail for controlling highly unstable tilt-rotor systems, but propose a hybrid solution combining sliding mode control with neural networks to predict system dynamics. The LSTM-based implementation outperforms traditional methods while reducing computational overhead, advancing autonomous aerial vehicle control capabilities.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers propose Deep Active Re-Labeling (DARL), a framework addressing human annotation errors in deep active learning by allocating budget to re-annotate potentially mislabeled data. The method uses noise detection strategies to identify suspect instances, improving data quality and model performance under annotation noise.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠RadOT-Eval is a new AI framework that uses optimal transport algorithms to automatically evaluate radiology report generation by decomposing reports into structured clinical evidence units and detecting specific error types like omissions, hallucinations, and polarity reversals. The method achieves higher correlation with clinician-annotated errors than existing metrics and LLM-based evaluators, providing an auditable approach for quality assurance in high-stakes medical AI applications.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠Researchers presented a study on detecting hate speech and analyzing sentiment in Nepali-language memes using transformer-based machine learning models and ensemble learning techniques. The work addresses challenges specific to Nepali text analysis, including code-mixing and limited baseline datasets, demonstrating that soft voting ensemble strategies outperform standalone models for multi-class sentiment tasks by 15.8% in Macro F1-score.
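For readers unfamiliar with soft voting, the idea is to average the class probabilities predicted by each model and pick the class with the highest mean, rather than counting each model's top choice. The sketch below shows the general technique only; the function name `soft_vote`, the three-class label set, and the probability values are hypothetical and not taken from the study.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities across models and return (label, averages)."""
    n_models = len(prob_lists)
    if weights is None:
        # Default: every model contributes equally.
        weights = [1.0 / n_models] * n_models
    n_classes = len(prob_lists[0])
    avg = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists))
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg

# Hypothetical outputs of three models over (negative, neutral, positive).
model_probs = [
    [0.50, 0.30, 0.20],
    [0.20, 0.45, 0.35],
    [0.25, 0.40, 0.35],
]
label, avg = soft_vote(model_probs)
```

With these made-up numbers the averaged distribution favors the second class (index 1), even though the first model on its own would have chosen index 0.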
AI · Bullish · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce WorldDP, a hierarchical framework combining object-centric world models with diffusion policies to enable robots to perform complex multi-stage manipulation tasks. The approach uses high-level planning to generate subgoals that low-level diffusion policies execute, significantly outperforming existing methods on robotic benchmarks.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers present a novel methodology for detecting hallucinations in Visual Language Models by measuring sample complexity under counterfactual perturbations. Using circuit discovery techniques and causal influence metrics, they establish empirical bounds on the minimum counterfactual samples needed to reliably identify unstable hallucinated predictions.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers present a mathematical framework for auditing black-box algorithmic decision-makers by decomposing cumulative regret into per-period covariances between costs and policy decisions. The model-free approach enables practical auditing of sequential decision systems, with applications to platform mechanisms, repeated games, and algorithmic trading strategies without requiring access to private agent information.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠Researchers propose a closed-loop AI-enhanced architecture for continuous software quality intelligence that integrates requirement analysis, test prioritization, defect prediction, and production incident feedback. Testing on a semi-synthetic dataset demonstrates significant improvements: 35% reduction in test execution time, defect leakage reduction from 0.19 to 0.13, and detection effectiveness improvement from 0.72 to 0.84 across six release cycles.
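The before-and-after figures quoted above can be put on a common footing as relative changes. The snippet below is only arithmetic on the numbers in the summary; the helper name `relative_change` is introduced here for illustration.

```python
def relative_change(before, after):
    """Signed relative change from `before` to `after` (negative means a decrease)."""
    return (after - before) / before

# Figures quoted in the summary above.
leakage_change = relative_change(0.19, 0.13)    # defect leakage
detection_change = relative_change(0.72, 0.84)  # detection effectiveness

print(f"Defect leakage: {leakage_change:+.1%}")
print(f"Detection effectiveness: {detection_change:+.1%}")
```

On these figures, defect leakage falls by roughly 32% in relative terms and detection effectiveness rises by roughly 17%.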
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers propose a novel framework combining Lagrangian decomposition with decision-focused learning to improve scalability and computational efficiency in predict-then-optimize problems. The approach demonstrates competitive performance on large-scale benchmarks with up to 8x more variables than previous methods, while maintaining parallelization capabilities.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce the Governance-Aware Autonomous Testing Framework (GATF), which adds governance validation, compliance monitoring, and explainability controls to AI-powered software testing systems. The framework achieved 89.6% reduction in governance-related risks and demonstrated high accuracy across multiple performance metrics, addressing critical concerns about AI-generated test artifacts including hallucinations and security vulnerabilities.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers demonstrate that simple K-nearest neighbor models leveraging biological knowledge graphs achieve competitive performance in predicting gene knockout effects on transcriptomic expression, with reinforcement learning-optimized LLMs further improving results to match state-of-the-art methods. This work suggests knowledge graphs serve as effective model priors for complex biological prediction tasks.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠Researchers introduce BLM-SGAN, a novel text-to-image generation model that combines bidirectional language modeling with GANs to improve image synthesis from text descriptions. The model achieves state-of-the-art performance metrics, outperforming existing approaches by better capturing contextual dependencies and reducing training limitations.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠Researchers present a novel deep neural network approach that combines handwritten character detection and classification into a single task, eliminating the need for manual annotation by using synthetically generated training data. The method achieves 88.28% recognition accuracy on real exam forms, demonstrating superior performance compared to traditional two-stage approaches.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce SO-101, a standardized real-world benchmark for evaluating Vision-Language-Action (VLA) models on affordable robotic platforms. The study benchmarks multiple VLA and imitation learning policies, revealing that execution instability is the dominant failure mode and that recovery capabilities vary significantly across architectures, highlighting the gap between simulation-based evaluations and real-world robotic deployment.
AI · Bullish · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers have developed a lightweight transformer-based method to detect reward hacking in AI systems that operates at a fraction of the cost of existing approaches. The technique achieves comparable performance to LLM-based judges while demonstrating superior true positive rates, suggesting efficient alternatives to expensive AI evaluation methods are feasible.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠Researchers propose a new method for few-shot class-variable incremental audio classification that handles both increasing and decreasing numbers of classes, addressing a practical gap in existing models. The approach uses prototype adaptation and pseudo class-variable training to dynamically adjust classifier structure as classes change, demonstrating improved performance on multiple datasets.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠Researchers propose a two-stage vision-language framework using Qwen3-VL with LoRA fine-tuning to detect semiconductor lithography defects, then employ a refinement module trained on first-stage failures to improve accuracy beyond standard single-stage approaches.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠PolyBuild introduces an end-to-end deep learning method for extracting building polygon contours directly from high-resolution remote sensing images without post-processing. The hybrid CNN-Transformer architecture combines an Initial Contour Generation Module with a Contour Optimization Module to achieve superior performance over existing mask-based and contour-based approaches.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce NormBench, a benchmark with 2,290 legal provisions across multiple languages, and Span-Grounded Deontic Trees (SG-DT), a structured representation method designed to address Silent Scope Omission, where AI systems appear compliant but fail to apply nested exceptions correctly. Testing reveals that frontier LLMs struggle with recursive defeater chains and fail to assemble correct logical control flow even when they retrieve the relevant source material.
AI · Bullish · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers propose PAI, a novel anomaly scoring scheme that addresses a critical limitation in representation-based time-series anomaly detection by explicitly preserving amplitude information in learned embeddings. The method achieves significant performance improvements, with average gains of 98.4% on TSB-AD-U-Eva and 36.8% on TAB UV datasets, suggesting that amplitude retention is crucial for robust anomaly detection.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠The CHIIR 2026 Workshop on Generative AI and Academic Search convened researchers to examine how GenAI is transforming academic research systems beyond traditional document retrieval. Discussions centered on three themes—foundations, applications, and search-as-learning—emphasizing human-centered design principles that prioritize research integrity, transparency, and higher-order cognitive support.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce PACT, a training framework that enables large language models to master multiple diagnostic reasoning strategies simultaneously for clinical decision-making. The method uses supervised dialogue synthesis with complete medical records and a consensus-based training approach, achieving state-of-the-art performance on a new Chinese medical diagnosis benchmark.
AI · Bullish · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers developed NutriMLLM, a specialized family of vision-language models trained on 1.1 million synthetic food images with complete 65-nutrient labels, to accurately estimate dietary micronutrients from photographs. The models outperform existing proprietary systems like GPT-5 and Gemini 3 on most nutrients, addressing a critical gap in clinical nutrition assessment where previous MLLMs frequently failed or produced implausible results.