20 articles tagged with #planning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · 1d ago · 7/10
🧠 Researchers demonstrate that multi-token prediction (MTP) outperforms standard next-token prediction (NTP) for training language models on reasoning tasks like planning and pathfinding. Through theoretical analysis of simplified Transformers, they reveal that MTP enables a reverse reasoning process in which models first identify end states and then reconstruct paths backward, suggesting MTP induces more interpretable and robust reasoning circuits.
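The contrast between the two training objectives can be sketched as target construction over a token sequence (a toy illustration, not the paper's implementation; the function names are ours):

```python
def ntp_targets(tokens):
    # Standard next-token prediction: each position predicts one token ahead.
    return [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]

def mtp_targets(tokens, k=3):
    # Multi-token prediction: each position predicts the next k tokens jointly,
    # which (per the paper's analysis) encourages committing to an end state early.
    out = []
    for i in range(len(tokens) - k):
        out.append((tokens[i], tuple(tokens[i + 1:i + 1 + k])))
    return out
```

With `k=1`, MTP collapses back to NTP; the interesting regime is `k > 1`, where the supervision signal spans several steps of the path at once.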
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠 Researchers introduced CRASH, an LLM-based agent that analyzes autonomous vehicle incidents from NHTSA data covering 2,168 cases and more than 80 million miles driven between 2021 and 2025. The system achieved 86% accuracy in fault attribution and found that 64% of incidents stem from perception or planning failures, with rear-end collisions comprising 50% of all reported incidents.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠 Researchers developed Localized In-Context Learning (L-ICL), a technique that significantly improves large language model performance on symbolic planning tasks by targeting specific constraint violations with minimal corrections. The method achieves 89% valid plan generation compared to 59% for the best baselines, representing a major advancement in LLM reasoning capabilities.
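The "minimal correction" idea can be illustrated with a toy validator that reports only the first violated constraint, rather than dumping every error back into the prompt (the function and constraint format here are illustrative assumptions, not the paper's API):

```python
def localized_feedback(plan, constraints):
    # Check each plan step against each constraint; report only the first
    # violation found, so the model receives one minimal targeted correction.
    for step_idx, step in enumerate(plan):
        for name, check in constraints:
            if not check(step):
                return f"Step {step_idx} ('{step}') violates constraint: {name}"
    return None  # plan is valid
```

In an L-ICL-style loop, the returned message would be appended to the context and the model asked to repair just that step.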
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠 Researchers developed CoCo-TAMP, a robot planning framework that uses large language models to improve state estimation in partially observable environments. The system leverages LLMs' common-sense reasoning to predict object locations and co-locations, achieving a 62–73% reduction in planning time compared to baseline methods.
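One way to picture the common-sense prior at work: rank candidate locations before searching, so the planner inspects likely spots first (a plain dict stands in for LLM-scored co-location likelihoods; names are ours):

```python
def ordered_search_locations(obj, locations, prior):
    # Rank candidate locations by a common-sense prior over (object, location)
    # pairs, so search visits the most plausible spots first and prunes early.
    return sorted(locations, key=lambda loc: prior.get((obj, loc), 0.0), reverse=True)
```

Cutting the expected number of locations visited is one plausible source of the reported planning-time reduction.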
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠 Researchers introduce PhysMem, a memory framework that enables vision-language model robot planners to learn physical principles through real-time interaction without updating model parameters. The system records experiences, generates hypotheses, and verifies them before application, achieving 76% success on brick insertion tasks compared to 23% for direct experience retrieval.
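A minimal sketch of the record/hypothesize/verify loop described above, with toy stand-ins for hypothesis generation and verification (the real system operates on vision-language experience, not strings):

```python
class PhysicalMemory:
    def __init__(self):
        self.experiences = []
        self.verified = {}

    def record(self, action, outcome):
        # Store raw interaction experience; no model parameters are updated.
        self.experiences.append((action, outcome))

    def hypothesize(self, action):
        # Toy rule: hypothesize the majority outcome seen for this action.
        outcomes = [o for a, o in self.experiences if a == action]
        return max(set(outcomes), key=outcomes.count) if outcomes else None

    def verify(self, action, observed):
        # Only hypotheses confirmed against a new observation get applied.
        if self.hypothesize(action) == observed:
            self.verified[action] = observed
            return True
        return False
```

The key property mirrored here is that nothing enters the applied memory (`verified`) without passing a check, which is what separates this from direct experience retrieval.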
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers have developed MagicAgent, a series of foundation models designed for generalized AI agent planning that outperforms existing sub-100B models and even surpasses leading ultra-scale models like GPT-5.2. The models achieve superior performance through a novel synthetic data framework and a two-stage training paradigm that addresses gradient interference in multi-task learning.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10
🧠 Researchers propose a new sparse imagination technique for visual world model planning that significantly reduces computational burden while maintaining task performance. The method uses transformers with randomized grouped attention to enable efficient planning in resource-constrained environments like robotics.
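The randomized grouping idea can be sketched as a block-sparse attention mask: tokens attend only within a randomly assigned group, replacing the quadratic all-pairs pattern (a toy construction; the paper's actual attention scheme may differ):

```python
import random

def grouped_attention_mask(n_tokens, n_groups, seed=0):
    # Assign each token to a random group; token i may attend to token j
    # only when they share a group, sparsifying the attention pattern.
    rng = random.Random(seed)
    group = [rng.randrange(n_groups) for _ in range(n_tokens)]
    return [[group[i] == group[j] for j in range(n_tokens)]
            for i in range(n_tokens)]
```

Expected cost drops roughly by a factor of the number of groups, which is the efficiency lever for resource-constrained planning.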
AI · Bullish · Google DeepMind Blog · May 20 · 7/10
🧠 Google is expanding Gemini AI to become a universal world model capable of making plans and simulating new experiences. This represents a significant advancement toward building comprehensive AI assistants that can understand and interact with complex real-world scenarios.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers evaluated whether large language models can function as text-only controllers for navigation and exploration in unknown environments under partial observability. Testing nine contemporary LLMs on ASCII gridworld tasks, they found reasoning-tuned models reliably complete navigation goals but remain inefficient compared to optimal paths, with few-shot prompting reducing invalid moves and improving path efficiency.
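A minimal ASCII gridworld step function of the kind such an evaluation might use, including the invalid-move bookkeeping (this is our own sketch, not the paper's harness):

```python
def step(grid, pos, move):
    # Minimal ASCII gridworld: '#' is a wall, '.' is free space.
    # Returns (new_position, was_valid); invalid moves leave the agent
    # in place, mirroring the invalid-move metric in the evaluation.
    dr, dc = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}[move]
    r, c = pos[0] + dr, pos[1] + dc
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#":
        return (r, c), True
    return pos, False
```

An LLM controller would be prompted with the rendered grid and asked to emit N/S/E/W tokens; counting `False` returns gives the invalid-move rate.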
AI · Neutral · arXiv – CS AI · Mar 26 · 6/10
🧠 Researchers propose DUPLEX, a dual-system architecture that restricts LLMs to information extraction rather than end-to-end planning, using symbolic planners for logical synthesis. The system demonstrated superior performance across 12 planning domains by leveraging LLMs for semantic grounding while avoiding their hallucination tendencies in complex reasoning tasks.
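The symbolic half of the split can be sketched as plain search over grounded actions (a toy breadth-first search; the LLM extraction stage, which would produce the `actions` list from natural-language input, is omitted):

```python
from collections import deque

def symbolic_plan(init, goal, actions):
    # Breadth-first search over states reachable via grounded actions.
    # Because plan synthesis is symbolic, the result is verifiable and
    # cannot be hallucinated, only absent (None) if no plan exists.
    queue, seen = deque([(init, [])]), {init}
    while queue:
        state, plan = queue.popleft()
        if state == goal:
            return plan
        for name, fn in actions:
            nxt = fn(state)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [name]))
    return None
```

The LLM's only job in this division of labor is producing a faithful `init`/`goal`/`actions` encoding; everything downstream is classical search.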
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠 Researchers introduce Imagine-then-Plan (ITP), a new AI framework that enables agents to learn through adaptive lookahead imagination using world models. The system allows AI agents to simulate multi-step future scenarios and adjust planning horizons dynamically, significantly outperforming existing methods in benchmark tests.
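A toy version of adaptive lookahead: roll a world model forward and grow the horizon only while the simulated return keeps improving (the stopping rule and names here are our assumptions, not ITP's algorithm):

```python
def adaptive_lookahead(world_model, state, max_h, eps=1e-6):
    # world_model(state) -> (next_state, reward). Extend the imagined
    # horizon step by step; stop once an extra step no longer helps.
    best, horizon = float("-inf"), 0
    total, s = 0.0, state
    for h in range(1, max_h + 1):
        s, reward = world_model(s)
        total += reward
        if total > best + eps:
            best, horizon = total, h
        else:
            break
    return horizon, best
```

The dynamic-horizon property is the point: cheap short rollouts when the future is unrewarding, longer ones only when imagination keeps paying off.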
AI · Neutral · arXiv – CS AI · Mar 2 · 7/10
🧠 Researchers propose a new theoretical framework for AI planning under changing conditions using causal POMDPs (Partially Observable Markov Decision Processes). The framework represents environmental changes as interventions, enabling AI systems to evaluate and adapt plans when underlying conditions shift while maintaining computational tractability.
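In this spirit, a single belief-filtering step where an intervention swaps out the transition kernel might look like the following (illustrative names and discrete toy distributions; not the paper's formalism):

```python
def belief_update(belief, transition, obs_likelihood, obs, intervention=None):
    # Standard POMDP belief filtering over a discrete state space.
    # An environmental change is modeled as an intervention that
    # replaces the transition kernel for this step.
    T = intervention if intervention is not None else transition
    predicted = {}
    for s, p in belief.items():
        for s2, q in T(s).items():            # predict: sum_s b(s) T(s'|s)
            predicted[s2] = predicted.get(s2, 0.0) + p * q
    posterior = {s: p * obs_likelihood(s, obs) for s, p in predicted.items()}
    z = sum(posterior.values())
    return {s: p / z for s, p in posterior.items()}  # normalize
```

Because the intervention only touches the kernel, plans can be re-evaluated under the shifted dynamics without rebuilding the rest of the model.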
AI · Bullish · arXiv – CS AI · Mar 2 · 7/10
🧠 Researchers introduce PseudoAct, a new framework that uses pseudocode synthesis to improve large language model agent planning and action control. The method achieves significant performance improvements over existing reactive approaches, with a 20.93% absolute gain in success rate on the FEVER benchmark and new state-of-the-art results on HotpotQA.
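The pseudocode-as-controller idea can be illustrated with a toy interpreter that dispatches each synthesized line to a registered tool (our sketch, not PseudoAct's actual execution model):

```python
def run_pseudocode(lines, tools):
    # Each pseudocode line is "verb argument"; the verb selects a tool.
    # Structured control like this replaces free-form reactive action text.
    results = []
    for line in lines:
        verb, _, arg = line.partition(" ")
        results.append(tools[verb](arg))
    return results
```

The advantage over purely reactive prompting is that the plan is an inspectable artifact: malformed verbs fail loudly at dispatch rather than silently derailing the agent.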
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
🧠 Researchers propose a new approach to generalized planning that learns explicit transition models rather than directly predicting action sequences. This method achieves better out-of-distribution performance with fewer training instances and smaller models compared to Transformer-based planners like PlanGPT.
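The model-then-plan split can be sketched as learning an explicit transition table from observed triples and then searching it (a deterministic toy; the paper's learned models are richer than a lookup table):

```python
from collections import deque

def learn_model(trajectories):
    # Fit an explicit transition model from (state, action, next_state)
    # triples, instead of mapping states directly to action sequences.
    model = {}
    for s, a, s2 in trajectories:
        model[(s, a)] = s2
    return model

def plan(model, start, goal):
    # Breadth-first search through the learned model. Generalization comes
    # from recombining learned transitions into plans never seen in training.
    queue, seen = deque([(start, [])]), {start}
    while queue:
        s, acts = queue.popleft()
        if s == goal:
            return acts
        for (s0, a), s2 in model.items():
            if s0 == s and s2 not in seen:
                seen.add(s2)
                queue.append((s2, acts + [a]))
    return None
```

This recombination is why the approach can hold up out of distribution: the planner is not limited to action sequences that appeared verbatim in the data.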
AI · Bullish · Lil'Log (Lilian Weng) · Jun 23 · 6/10
🧠 The article explores LLM-powered autonomous agents that use large language models as core controllers, going beyond text generation to serve as general problem solvers. Key systems like AutoGPT, GPT-Engineer, and BabyAGI demonstrate the potential of agents with planning, memory, and tool-use capabilities.
AI · Neutral · arXiv – CS AI · Mar 9 · 4/10
🧠 Researchers developed PyPDDLEngine, an open-source tool that allows large language models to perform task planning through interactive PDDL simulation. Testing on 102 planning problems showed agentic LLM planning achieved 66.7% success versus 63.7% for direct LLM planning, but at 5.7x higher token cost, while classical planning methods reached 85.3% success.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠 Researchers propose a new approach to world models that combines explicit simulators with learned models using the DEVS formalism. The method uses LLMs to generate discrete-event world models from natural language specifications, targeting environments with event-driven dynamics like queueing systems and multi-agent coordination.
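A minimal discrete-event loop in the DEVS spirit, driven by a priority queue of timestamped events (a toy sketch; DEVS proper has richer state-transition semantics than this):

```python
import heapq

def simulate(events, handlers, until):
    # events: iterable of (time, name) pairs; handlers map an event name
    # to a function t -> list of follow-up (time, name) events to schedule.
    # Time jumps event-to-event, which is what makes the dynamics
    # event-driven rather than fixed-timestep.
    queue = list(events)
    heapq.heapify(queue)
    log = []
    while queue and queue[0][0] <= until:
        t, name = heapq.heappop(queue)
        log.append((t, name))
        for nt, nname in handlers.get(name, lambda t: [])(t):
            heapq.heappush(queue, (nt, nname))
    return log
```

A queueing system falls out naturally: an "arrive" handler schedules the corresponding "depart" event after a service delay.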
AI · Neutral · Apple Machine Learning · Feb 23 · 4/10
🧠 Apple is hosting the Workshop on Reasoning and Planning 2025, focusing on advancing AI systems' reasoning capabilities. The workshop brings together Apple researchers and external members to explore new techniques and understand current limitations in AI reasoning and planning.
AI · Neutral · OpenAI News · Nov 5 · 4/10
🧠 The article discusses a model-based control approach for efficient learning and exploration that combines online planning with offline learning. This methodology aims to optimize the balance between computational efficiency and learning effectiveness in AI systems.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10
🧠 Researchers have released TaCarla, a comprehensive dataset containing over 2.85 million frames from the CARLA simulation environment, designed for end-to-end autonomous driving research. The dataset addresses limitations in existing autonomous driving datasets by providing both perception and planning data with diverse behavioral scenarios for comprehensive model training and evaluation.