AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers propose Hybrid Hierarchical RL (H²RL), a new framework that combines symbolic logic with deep reinforcement learning to address misalignment issues in AI agents. The method uses logical option-based pretraining to improve long-horizon decision-making and prevent agents from over-exploiting short-term rewards.
AI · Bullish · arXiv – CS AI · Mar 6 · 6/10
🧠Researchers propose STRUCTUREDAGENT, a new AI framework that uses hierarchical planning with AND/OR trees to improve web agent performance on complex, long-horizon tasks. The system addresses limitations in current LLM-based agents through better memory tracking and structured planning approaches.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 6
🧠Researchers developed TARSE, a new AI system for clinical decision-making that retrieves relevant medical skills and experiences from curated libraries to improve reasoning accuracy. The system performs test-time adaptation to align language models with clinically valid logic, showing improvements over existing medical AI baselines in question-answering benchmarks.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠A research study evaluated six state-of-the-art large language models in geopolitical crisis simulations, comparing their decision-making to human behavior. The study found that LLMs initially mirror human decisions but diverge over time, consistently exhibiting cooperative, stability-focused strategies with limited adversarial reasoning.
AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 3
🧠Researchers developed behavioral generative agents powered by large language models to simulate consumer decision-making in energy operations. The study found these AI agents can model heterogeneous customer behavior and provide insights into rare events like blackouts, offering a scalable tool for energy policy analysis.
AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 16
🧠Researchers introduce PseudoAct, a new framework that uses pseudocode synthesis to improve large language model agent planning and action control. The method outperforms existing reactive approaches, with a 20.93% absolute gain in success rate on the FEVER benchmark and new state-of-the-art results on HotpotQA.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers propose an Evaluation Agent framework to assess AI agent decision-making in AutoML pipelines, moving beyond outcome-focused metrics to evaluate intermediate decisions. The system detects faulty decisions with a 91.9% F1 score and shows that individual decisions change final performance metrics by between -4.9% and +8.3%.
AI · Bearish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers developed ClinDet-Bench, a new benchmark that reveals large language models fail to properly identify when they have sufficient information to make clinical decisions. The study shows LLMs make both premature judgments and excessive abstentions in medical scenarios, highlighting safety concerns for AI deployment in healthcare settings.
AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 5
🧠Researchers identified stochasticity (variability) as a critical barrier to deploying Deep Research Agents in real-world applications like financial decision-making and medical analysis. The study proposes mitigation strategies that reduce output variance by 22% while maintaining research quality, addressing a key obstacle for enterprise AI agent adoption.
Crypto · Neutral · Ethereum Foundation Blog · Aug 21 · 5/10 · 4
⛓️The article introduces futarchy as a governance mechanism that could be implemented through DAOs to improve decision-making processes. It explores how decentralized autonomous organizations enable rapid experimentation with social coordination mechanisms that traditional institutions struggle to adopt quickly.
General · Neutral · Crypto Briefing · May 12 · 5/10
📰Simone Stolzoff discusses how embracing uncertainty improves decision-making, with personal values serving as anchors during chaotic periods. The article highlights that technology, while powerful, can undermine our natural coping mechanisms, using Slack's business pivot as an example of how uncertainty can drive innovation.
AI × Crypto · Neutral · The Block · Apr 20 · 5/10
🤖Coinbase is developing AI agents modeled after former executives Fred Ehrsam and Balaji Srinivasan to provide high-level strategic feedback to staff. CEO Brian Armstrong announced the initiative, signaling the exchange's investment in AI-driven operational tools that leverage the decision-making patterns of influential crypto industry figures.
AI · Neutral · MarkTechPost · Mar 10 · 5/10
🧠This tutorial demonstrates building an advanced AI agent system that incorporates risk-awareness through internal criticism, self-consistency reasoning, and uncertainty estimation. The system evaluates responses across multiple dimensions including accuracy, coherence, and safety while implementing risk-sensitive selection strategies for more reliable decision-making.
AI · Neutral · The Register – AI · Mar 5 · 4/10
🧠The article title suggests that high-level executives are increasingly delegating important business decisions to artificial intelligence systems. However, no article body was provided for detailed analysis.
AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 5
🧠Researchers introduced VAF, a systematic evaluation pipeline to measure how visual web elements influence AI agent decision-making. The study tested 48 variants across 5 real-world websites and found that background contrast, item size, position, and card clarity significantly impact agent behavior, while font styling and text color have minimal effects.
AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 6
🧠Researchers have introduced fEDM+, an enhanced fuzzy ethical decision-making framework for AI systems that provides principle-level explainability and validates decisions against multiple stakeholder perspectives. The framework extends the original fEDM by adding transparent explanations of ethical decisions and replacing single-point validation with pluralistic validation that accommodates different ethical viewpoints.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 5
🧠Researchers propose MO-MIX, a new deep reinforcement learning approach to multi-objective, multi-agent cooperative decision-making. The method combines centralized training with decentralized execution and outperforms baseline methods at lower computational cost.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4
🧠Researchers present a multi-agent Large Language Model framework for interactive AI planning systems that provides context-dependent explanations to human planners. The system aims to facilitate collaborative decision-making between humans and AI rather than replacing human planners entirely.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6
🧠Researchers have developed ArgLLM-App, a web-based system that uses Large Language Models for argumentative reasoning in decision-making tasks. The system allows human users to visualize explanations and contest reasoning mistakes, making AI decisions more transparent and contestable.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6
🧠Researchers introduce resilient strategies for stochastic systems, focusing on decision-making that remains robust against disturbances that could flip agent decisions. The work presents fundamental problems for Markov decision processes with reachability and safety objectives, extending to stochastic games with various disturbance aggregation methods.
AI · Neutral · arXiv – CS AI · Feb 27 · 3/10 · 6
🧠Researchers developed a machine learning method to predict professional tennis players' first serve directions, achieving 49% accuracy for male players and 44% for female players. The study provides evidence that top players use mixed-strategy serving decisions and suggests contextual information plays a larger role in tennis strategy than previously understood.