20 articles tagged with #planning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · 1d ago · 7/10
🧠 Researchers demonstrate that multi-token prediction (MTP) outperforms standard next-token prediction (NTP) for training language models on reasoning tasks like planning and pathfinding. Through theoretical analysis of simplified Transformers, they reveal that MTP enables a reverse reasoning process in which models first identify end states and then reconstruct paths backward, suggesting MTP induces more interpretable and robust reasoning circuits.
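The contrast between the two training objectives can be sketched as target construction over a token sequence (a toy illustration, not the paper's implementation; the function names are ours):

```python
def ntp_targets(tokens):
    # Standard next-token prediction: each position predicts one token ahead.
    return [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]

def mtp_targets(tokens, k=3):
    # Multi-token prediction: each position predicts the next k tokens jointly,
    # which (per the paper's analysis) encourages committing to an end state early.
    out = []
    for i in range(len(tokens) - k):
        out.append((tokens[i], tuple(tokens[i + 1:i + 1 + k])))
    return out
```

With `k=1`, MTP collapses back to NTP; the interesting regime is `k > 1`, where the supervision signal spans several steps of the path at once.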
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠 Researchers introduced CRASH, an LLM-based agent that analyzes autonomous vehicle incidents from NHTSA data covering 2,168 cases and more than 80 million miles driven between 2021 and 2025. The system achieved 86% accuracy in fault attribution and found that 64% of incidents stem from perception or planning failures, with rear-end collisions comprising 50% of all reported incidents.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠 Researchers developed Localized In-Context Learning (L-ICL), a technique that significantly improves large language model performance on symbolic planning tasks by targeting specific constraint violations with minimal corrections. The method achieves 89% valid plan generation compared to 59% for the best baselines, representing a major advancement in LLM reasoning capabilities.
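The "minimal correction" idea can be illustrated with a toy validator that reports only the first violated constraint, rather than dumping every error back into the prompt (the function and constraint format here are illustrative assumptions, not the paper's API):

```python
def localized_feedback(plan, constraints):
    # Check each plan step against each constraint; report only the first
    # violation found, so the model receives one minimal targeted correction.
    for step_idx, step in enumerate(plan):
        for name, check in constraints:
            if not check(step):
                return f"Step {step_idx} ('{step}') violates constraint: {name}"
    return None  # plan is valid
```

In an L-ICL-style loop, the returned message would be appended to the context and the model asked to repair just that step.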
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠 Researchers developed CoCo-TAMP, a robot planning framework that uses large language models to improve state estimation in partially observable environments. The system leverages LLMs' common-sense reasoning to predict object locations and co-locations, achieving a 62–73% reduction in planning time compared to baseline methods.
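One way to picture the common-sense prior at work: rank candidate locations before searching, so the planner inspects likely spots first (a plain dict stands in for LLM-scored co-location likelihoods; names are ours):

```python
def ordered_search_locations(obj, locations, prior):
    # Rank candidate locations by a common-sense prior over (object, location)
    # pairs, so search visits the most plausible spots first and prunes early.
    return sorted(locations, key=lambda loc: prior.get((obj, loc), 0.0), reverse=True)
```

Cutting the expected number of locations visited is one plausible source of the reported planning-time reduction.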
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠 Researchers introduce PhysMem, a memory framework that enables vision-language model robot planners to learn physical principles through real-time interaction without updating model parameters. The system records experiences, generates hypotheses, and verifies them before application, achieving 76% success on brick insertion tasks compared to 23% for direct experience retrieval.
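A minimal sketch of the record/hypothesize/verify loop described above, with toy stand-ins for hypothesis generation and verification (the real system operates on vision-language experience, not strings):

```python
class PhysicalMemory:
    def __init__(self):
        self.experiences = []
        self.verified = {}

    def record(self, action, outcome):
        # Store raw interaction experience; no model parameters are updated.
        self.experiences.append((action, outcome))

    def hypothesize(self, action):
        # Toy rule: hypothesize the majority outcome seen for this action.
        outcomes = [o for a, o in self.experiences if a == action]
        return max(set(outcomes), key=outcomes.count) if outcomes else None

    def verify(self, action, observed):
        # Only hypotheses confirmed against a new observation get applied.
        if self.hypothesize(action) == observed:
            self.verified[action] = observed
            return True
        return False
```

The key property mirrored here is that nothing enters the applied memory (`verified`) without passing a check, which is what separates this from direct experience retrieval.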
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers have developed MagicAgent, a series of foundation models designed for generalized AI agent planning that outperforms existing sub-100B models and even surpasses leading ultra-scale models like GPT-5.2. The models achieve superior performance through a novel synthetic data framework and a two-stage training paradigm that addresses gradient interference in multi-task learning.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10
🧠 Researchers propose a new sparse imagination technique for visual world model planning that significantly reduces computational burden while maintaining task performance. The method uses transformers with randomized grouped attention to enable efficient planning in resource-constrained environments like robotics.
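The randomized grouping idea can be sketched as a block-sparse attention mask: tokens attend only within a randomly assigned group, replacing the quadratic all-pairs pattern (a toy construction; the paper's actual attention scheme may differ):

```python
import random

def grouped_attention_mask(n_tokens, n_groups, seed=0):
    # Assign each token to a random group; token i may attend to token j
    # only when they share a group, sparsifying the attention pattern.
    rng = random.Random(seed)
    group = [rng.randrange(n_groups) for _ in range(n_tokens)]
    return [[group[i] == group[j] for j in range(n_tokens)]
            for i in range(n_tokens)]
```

Expected cost drops roughly by a factor of the number of groups, which is the efficiency lever for resource-constrained planning.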
AI · Bullish · Google DeepMind Blog · May 20 · 7/10
🧠 Google is expanding Gemini AI to become a universal world model capable of making plans and simulating new experiences. This represents a significant advancement toward building comprehensive AI assistants that can understand and interact with complex real-world scenarios.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers evaluated whether large language models can function as text-only controllers for navigation and exploration in unknown environments under partial observability. Testing nine contemporary LLMs on ASCII gridworld tasks, they found reasoning-tuned models reliably complete navigation goals but remain inefficient compared to optimal paths, with few-shot prompting reducing invalid moves and improving path efficiency.
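A minimal ASCII gridworld step function of the kind such an evaluation might use, including the invalid-move bookkeeping (this is our own sketch, not the paper's harness):

```python
def step(grid, pos, move):
    # Minimal ASCII gridworld: '#' is a wall, '.' is free space.
    # Returns (new_position, was_valid); invalid moves leave the agent
    # in place, mirroring the invalid-move metric in the evaluation.
    dr, dc = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}[move]
    r, c = pos[0] + dr, pos[1] + dc
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#":
        return (r, c), True
    return pos, False
```

An LLM controller would be prompted with the rendered grid and asked to emit N/S/E/W tokens; counting `False` returns gives the invalid-move rate.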
AI · Neutral · arXiv – CS AI · Mar 26 · 6/10
🧠 Researchers propose DUPLEX, a dual-system architecture that restricts LLMs to information extraction rather than end-to-end planning, using symbolic planners for logical synthesis. The system demonstrated superior performance across 12 planning domains by leveraging LLMs for semantic grounding while avoiding their hallucination tendencies in complex reasoning tasks.
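The symbolic half of the split can be sketched as plain search over grounded actions (a toy breadth-first search; the LLM extraction stage, which would produce the `actions` list from natural-language input, is omitted):

```python
from collections import deque

def symbolic_plan(init, goal, actions):
    # Breadth-first search over states reachable via grounded actions.
    # Because plan synthesis is symbolic, the result is verifiable and
    # cannot be hallucinated, only absent (None) if no plan exists.
    queue, seen = deque([(init, [])]), {init}
    while queue:
        state, plan = queue.popleft()
        if state == goal:
            return plan
        for name, fn in actions:
            nxt = fn(state)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [name]))
    return None
```

The LLM's only job in this division of labor is producing a faithful `init`/`goal`/`actions` encoding; everything downstream is classical search.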
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠 Researchers introduce Imagine-then-Plan (ITP), a new AI framework that enables agents to learn through adaptive lookahead imagination using world models. The system allows AI agents to simulate multi-step future scenarios and adjust planning horizons dynamically, significantly outperforming existing methods in benchmark tests.
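A toy version of adaptive lookahead: roll a world model forward and grow the horizon only while the simulated return keeps improving (the stopping rule and names here are our assumptions, not ITP's algorithm):

```python
def adaptive_lookahead(world_model, state, max_h, eps=1e-6):
    # world_model(state) -> (next_state, reward). Extend the imagined
    # horizon step by step; stop once an extra step no longer helps.
    best, horizon = float("-inf"), 0
    total, s = 0.0, state
    for h in range(1, max_h + 1):
        s, reward = world_model(s)
        total += reward
        if total > best + eps:
            best, horizon = total, h
        else:
            break
    return horizon, best
```

The dynamic-horizon property is the point: cheap short rollouts when the future is unrewarding, longer ones only when imagination keeps paying off.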
AI · Neutral · arXiv – CS AI · Mar 2 · 7/10
🧠 Researchers propose a new theoretical framework for AI planning under changing conditions using causal POMDPs (Partially Observable Markov Decision Processes). The framework represents environmental changes as interventions, enabling AI systems to evaluate and adapt plans when underlying conditions shift while maintaining computational tractability.
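In this spirit, a single belief-filtering step where an intervention swaps out the transition kernel might look like the following (illustrative names and discrete toy distributions; not the paper's formalism):

```python
def belief_update(belief, transition, obs_likelihood, obs, intervention=None):
    # Standard POMDP belief filtering over a discrete state space.
    # An environmental change is modeled as an intervention that
    # replaces the transition kernel for this step.
    T = intervention if intervention is not None else transition
    predicted = {}
    for s, p in belief.items():
        for s2, q in T(s).items():            # predict: sum_s b(s) T(s'|s)
            predicted[s2] = predicted.get(s2, 0.0) + p * q
    posterior = {s: p * obs_likelihood(s, obs) for s, p in predicted.items()}
    z = sum(posterior.values())
    return {s: p / z for s, p in posterior.items()}  # normalize
```

Because the intervention only touches the kernel, plans can be re-evaluated under the shifted dynamics without rebuilding the rest of the model.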
AI · Bullish · arXiv – CS AI · Mar 2 · 7/10
🧠 Researchers introduce PseudoAct, a new framework that uses pseudocode synthesis to improve large language model agent planning and action control. The method achieves significant performance improvements over existing reactive approaches, with a 20.93% absolute gain in success rate on the FEVER benchmark and new state-of-the-art results on HotpotQA.
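The pseudocode-as-controller idea can be illustrated with a toy interpreter that dispatches each synthesized line to a registered tool (our sketch, not PseudoAct's actual execution model):

```python
def run_pseudocode(lines, tools):
    # Each pseudocode line is "verb argument"; the verb selects a tool.
    # Structured control like this replaces free-form reactive action text.
    results = []
    for line in lines:
        verb, _, arg = line.partition(" ")
        results.append(tools[verb](arg))
    return results
```

The advantage over purely reactive prompting is that the plan is an inspectable artifact: malformed verbs fail loudly at dispatch rather than silently derailing the agent.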
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
🧠 Researchers propose a new approach to generalized planning that learns explicit transition models rather than directly predicting action sequences. This method achieves better out-of-distribution performance with fewer training instances and smaller models compared to Transformer-based planners like PlanGPT.
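The model-then-plan split can be sketched as learning an explicit transition table from observed triples and then searching it (a deterministic toy; the paper's learned models are richer than a lookup table):

```python
from collections import deque

def learn_model(trajectories):
    # Fit an explicit transition model from (state, action, next_state)
    # triples, instead of mapping states directly to action sequences.
    model = {}
    for s, a, s2 in trajectories:
        model[(s, a)] = s2
    return model

def plan(model, start, goal):
    # Breadth-first search through the learned model. Generalization comes
    # from recombining learned transitions into plans never seen in training.
    queue, seen = deque([(start, [])]), {start}
    while queue:
        s, acts = queue.popleft()
        if s == goal:
            return acts
        for (s0, a), s2 in model.items():
            if s0 == s and s2 not in seen:
                seen.add(s2)
                queue.append((s2, acts + [a]))
    return None
```

This recombination is why the approach can hold up out of distribution: the planner is not limited to action sequences that appeared verbatim in the data.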
AI · Bullish · Lil'Log (Lilian Weng) · Jun 23 · 6/10
🧠 The article explores LLM-powered autonomous agents that use large language models as core controllers, going beyond text generation to serve as general problem solvers. Key systems like AutoGPT, GPT-Engineer, and BabyAGI demonstrate the potential of agents with planning, memory, and tool-use capabilities.
AI · Neutral · arXiv – CS AI · Mar 9 · 4/10
🧠 Researchers developed PyPDDLEngine, an open-source tool that allows large language models to perform task planning through interactive PDDL simulation. Testing on 102 planning problems showed agentic LLM planning achieved 66.7% success versus 63.7% for direct LLM planning, but at 5.7x higher token cost, while classical planning methods reached 85.3% success.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠 Researchers propose a new approach to world models that combines explicit simulators with learned models using the DEVS formalism. The method uses LLMs to generate discrete-event world models from natural language specifications, targeting environments with event-driven dynamics like queueing systems and multi-agent coordination.
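A minimal discrete-event loop in the DEVS spirit, driven by a priority queue of timestamped events (a toy sketch; DEVS proper has richer state-transition semantics than this):

```python
import heapq

def simulate(events, handlers, until):
    # events: iterable of (time, name) pairs; handlers map an event name
    # to a function t -> list of follow-up (time, name) events to schedule.
    # Time jumps event-to-event, which is what makes the dynamics
    # event-driven rather than fixed-timestep.
    queue = list(events)
    heapq.heapify(queue)
    log = []
    while queue and queue[0][0] <= until:
        t, name = heapq.heappop(queue)
        log.append((t, name))
        for nt, nname in handlers.get(name, lambda t: [])(t):
            heapq.heappush(queue, (nt, nname))
    return log
```

A queueing system falls out naturally: an "arrive" handler schedules the corresponding "depart" event after a service delay.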
AI · Neutral · Apple Machine Learning · Feb 23 · 4/10
🧠 Apple is hosting the Workshop on Reasoning and Planning 2025, focusing on advancing AI systems' reasoning capabilities. The workshop brings together Apple researchers and external members to explore new techniques and understand current limitations in AI reasoning and planning.
AI · Neutral · OpenAI News · Nov 5 · 4/10
🧠 The article discusses a model-based control approach for efficient learning and exploration that combines online planning with offline learning. This methodology aims to optimize the balance between computational efficiency and learning effectiveness in AI systems.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10
🧠 Researchers have released TaCarla, a comprehensive dataset containing over 2.85 million frames from the CARLA simulation environment, designed for end-to-end autonomous driving research. The dataset addresses limitations in existing autonomous driving datasets by providing both perception and planning data with diverse behavioral scenarios for comprehensive model training and evaluation.