9,311 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.
AI · Bullish · arXiv – CS AI · Feb 27 · 5/10 · 6
🧠Researchers propose a new AI inference method that uses invariant transformations and resampling to reduce epistemic uncertainty and improve model accuracy. The approach involves applying multiple transformed versions of an input to a trained AI model and aggregating the outputs for more reliable results.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers propose a new approach using Adversarial Inverse Reinforcement Learning for machinery fault detection that learns from healthy operational data without requiring manual fault labels. The framework treats fault detection as a sequential decision-making problem and demonstrates effective early fault detection on three benchmark datasets.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers developed ODEBRAIN, a Neural ODE framework that models continuous-time EEG brain dynamics by integrating spatio-temporal-frequency features into spectral graph nodes. The system overcomes limitations of traditional discrete-time models by capturing instantaneous, nonlinear brain characteristics without cumulative prediction errors.
AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 3
🧠Researchers developed CXReasonAgent, a diagnostic AI agent that combines large language models with clinical diagnostic tools to provide evidence-based chest X-ray analysis. The system addresses limitations of current vision-language models that generate plausible but ungrounded medical diagnoses, introducing a new benchmark with 1,946 diagnostic dialogues.
AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 5
🧠Researchers identified stochasticity (variability) as a critical barrier to deploying Deep Research Agents in real-world applications like financial decision-making and medical analysis. The study proposes mitigation strategies that reduce output variance by 22% while maintaining research quality, addressing a key obstacle for enterprise AI agent adoption.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers have introduced ESAA (Event Sourcing for Autonomous Agents), a new architecture that improves LLM-based autonomous agents by separating cognitive intention from state mutation using structured JSON events and deterministic orchestration. The system addresses key limitations like context degradation and execution reliability, with successful validation through multi-agent case studies using various LLMs including Claude Sonnet and GPT-5.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers have developed PATRA, a new AI model that improves time series question answering by better understanding patterns like trends and seasonality. The model addresses limitations in existing LLM approaches that treat time series data as simple text or images, introducing pattern-aware mechanisms and balanced learning across tasks of varying difficulty.
AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers developed ReCoN-Ipsundrum, an AI agent architecture designed to exhibit consciousness-like behaviors through recurrent persistence loops and affect-coupled control mechanisms. The study demonstrates how engineered systems can display preference stability, exploratory scanning, and sustained caution behaviors that mimic aspects of conscious experience.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers propose a new approach to generalized planning that learns explicit transition models rather than directly predicting action sequences. This method achieves better out-of-distribution performance with fewer training instances and smaller models compared to Transformer-based planners like PlanGPT.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers developed MALLET, a multi-agent AI system that reduces emotional intensity in news content by up to 19.3% while preserving semantic meaning. The system uses four specialized agents to analyze, adjust, and personalize content presentation modes for calmer decision-making without restricting access to original information.
AI · Bullish · arXiv – CS AI · Feb 27 · 5/10 · 7
🧠Researchers have developed RepSPD, a novel geometric deep learning model that enhances EEG brain activity decoding using symmetric positive definite manifolds and dynamic graphs. The framework introduces cross-attention mechanisms on Riemannian manifolds and bidirectional alignment strategies to improve brain signal representation and analysis.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers developed a framework for analyzing AI diagnostic systems in clinical settings by preserving original AI inferences and comparing them with physician corrections. The study of 21 dermatological cases showed 71.4% exact agreement between AI and physicians, with 100% comprehensive concordance when using structured analysis methods.
AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers have developed SPM-Bench, a PhD-level benchmark for testing large language models on scanning probe microscopy tasks. The benchmark uses automated data synthesis from scientific papers and introduces new evaluation metrics to assess AI reasoning capabilities in specialized scientific domains.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 8
🧠Researchers have developed FactGuard, an AI framework that uses multimodal large language models and reinforcement learning to detect video misinformation. The system addresses limitations of existing models by implementing iterative reasoning processes and external tool integration to verify information across video content.
AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers published a case study demonstrating successful human-AI collaboration in mathematical research, extending Hermite quadrature rule results beyond manual capabilities. The study reveals AI's strengths in algebraic manipulation and proof exploration, while highlighting the critical need for human verification and domain expertise in every step of the research process.
AI · Bullish · arXiv – CS AI · Feb 27 · 5/10 · 7
🧠DeepPresenter is a new AI framework for autonomous presentation generation that can plan, render, and revise slides through environment-grounded reflection rather than fixed templates. The system uses perceptual feedback from rendered slides to identify and correct presentation-specific issues, achieving state-of-the-art performance with a competitive 9B parameter model.
AI · Bearish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers developed ClinDet-Bench, a new benchmark that reveals large language models fail to properly identify when they have sufficient information to make clinical decisions. The study shows LLMs make both premature judgments and excessive abstentions in medical scenarios, highlighting safety concerns for AI deployment in healthcare settings.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers introduce AMA-Bench, a new benchmark for evaluating long-horizon memory in AI agents deployed in real-world applications. The study reveals existing memory systems underperform due to lack of causality and objective information, while their proposed AMA-Agent system achieves 57.22% accuracy, surpassing baselines by 11.16%.
AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 5
🧠An analysis of physician disagreement in the HealthBench medical AI evaluation dataset finds that 81.8% of disagreement variance is unexplained by observable features, with rubric identity accounting for only 15.8% of variance. The study reveals that physicians agree on clearly good or bad AI outputs but disagree on borderline cases, suggesting structural limits to the consistency of medical AI evaluation.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5
🧠Researchers propose TAESAR, a new data-centric framework for improving recommendation models by transforming mixed-domain data into unified target-domain sequences. The approach uses contrastive decoding to address domain gaps and data sparsity issues, outperforming traditional model-centric solutions while generalizing across various sequential models.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers introduce RLHFless, a serverless computing framework for Reinforcement Learning from Human Feedback (RLHF) that addresses resource inefficiencies in training large language models. The system achieves up to 1.35x speedup and 44.8% cost reduction compared to existing solutions by dynamically adapting to resource demands and optimizing workload distribution.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers introduce SideQuest, a novel KV cache management system that uses Large Reasoning Models to compress memory usage during long-horizon AI tasks. The system reduces peak token usage by up to 65% while maintaining accuracy by having the model itself determine which tokens are useful to keep in memory.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 4
🧠Researchers propose an agentic AI framework using multiple LLM-based agents to optimize cell-free Open RAN networks through intent-driven automation. The system reduces active radio units by 42% in energy-saving mode while cutting memory usage by 92% through parameter-efficient fine-tuning.
AI · Neutral · arXiv – CS AI · Feb 27 · 5/10 · 2
🧠Researchers propose using cognitive models and AI algorithms as templates for designing modular language agents that combine multiple large language models. The position paper formalizes agent templates that specify roles for individual LLMs and how their functionalities should be composed to solve complex problems beyond single model capabilities.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers introduce AHCE (Active Human-Augmented Challenge Engagement), a framework that enables AI agents to collaborate with human experts more effectively through learned policies. In Minecraft experiments, the system achieved a 32% improvement on normal-difficulty tasks and a 70% improvement on difficult tasks by treating humans as interactive reasoning tools rather than simple help sources.