#sequential-decision-making News & Analysis

9 articles tagged with #sequential-decision-making. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

ABBEL: Learning Natural-Language Belief States for Memory-Efficient Interaction

ABBEL is a new recursive summarization framework that enables AI agents to maintain memory-efficient interaction histories by storing information as natural-language belief states rather than full context. The approach uses reinforcement learning techniques to improve belief generation quality, achieving 40% better performance than prior memory-constrained agents while using 67% less memory.

AI · Neutral · arXiv – CS AI · May 11 · 7/10

Agentick: A Unified Benchmark for General Sequential Decision-Making Agents

Researchers introduce Agentick, a unified benchmark for evaluating diverse AI agents—from reinforcement learning to large language models—across 37 procedurally generated tasks. Testing 27 configurations reveals no single approach dominates, with GPT-4 mini leading overall while specialized methods excel in specific domains, suggesting significant optimization potential across all agent paradigms.

🏢 Meta · 🧠 GPT-5

AI · Neutral · arXiv – CS AI · Jun 10 · 5/10

SCOPE: Sequential Causal Optimization of Process Interventions

Researchers introduce SCOPE, a new machine learning approach for Prescriptive Process Monitoring that optimizes sequential business interventions using causal inference rather than simulation-based reinforcement learning. The method addresses a critical gap in existing systems by accounting for how multiple interventions interact over time while working directly with observational data, demonstrated through testing on synthetic and semi-synthetic datasets.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

MINTS: Minimalist Thompson Sampling

Researchers introduce MINTS (Minimalist Thompson Sampling), a Bayesian framework that simplifies sequential decision-making under uncertainty by placing priors only on optimal parameters while eliminating unnecessary variables through profile likelihood. The approach achieves near-optimal regret bounds for multi-armed bandits and automatically adapts to structural constraints, matching classical performance benchmarks.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Attribution-based Explanations for Markov Decision Processes

Researchers have developed attribution techniques that explain decision-making in Markov Decision Processes (MDPs), extending explainability methods beyond static inputs to sequential decision-making systems. The approach assigns importance scores to states and execution paths, enabling more interpretable AI agents in dynamic environments.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Large Language Models for Sequential Decision-Making: Improving In-Context Learning via Supervised Fine-Tuning

Researchers demonstrate that large language models can be effectively fine-tuned to perform sequential decision-making tasks across MDPs, POMDPs, and ambiguous environments by learning from offline trajectory data. The approach achieves stronger performance than baseline methods, particularly in complex, partially-observed scenarios, with theoretical analysis showing the fine-tuned attention mechanisms implicitly estimate optimal Q-functions.

AI · Bullish · arXiv – CS AI · May 9 · 6/10

PRISM: Perception Reasoning Interleaved for Sequential Decision Making

PRISM is a new AI framework that improves embodied agents by coupling Vision-Language Models with Large Language Models through dynamic question-answer interactions, addressing the perception-reasoning gap in multimodal AI systems. The framework demonstrates significant performance improvements on benchmark tasks like ALFWorld and R2R, showing that interactive, goal-oriented perception yields superior understanding compared to standalone visual analysis.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

LLMs for Text-Based Exploration and Navigation Under Partial Observability

Researchers evaluated whether large language models can function as text-only controllers for navigation and exploration in unknown environments under partial observability. Testing nine contemporary LLMs on ASCII gridworld tasks, they found reasoning-tuned models reliably complete navigation goals but remain inefficient compared to optimal paths, with few-shot prompting reducing invalid moves and improving path efficiency.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

AI Planning Framework for LLM-Based Web Agents

Researchers introduce a formal planning framework that maps LLM-based web agents to traditional search algorithms, enabling better diagnosis of failures in autonomous web tasks. The study compares different agent architectures using novel evaluation metrics and a dataset of 794 human-labeled trajectories from the WebArena benchmark.