#imitation-learning News & Analysis

24 articles tagged with #imitation-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

24 articles

AIBullisharXiv – CS AI · May 127/10

🧠

RePO-VLA: Recovery-Driven Policy Optimization for Vision-Language-Action Models

Researchers introduce RePO-VLA, a policy optimization framework that improves Vision-Language-Action models' ability to recover from failures in complex manipulation tasks. The method increases adversarial robustness from 20% to 75% by learning from recovery trajectories rather than discarding failed attempts, with validation on both simulated and real-world robotic tasks.

AIBullisharXiv – CS AI · Apr 77/10

🧠

Build on Priors: Vision--Language--Guided Neuro-Symbolic Imitation Learning for Data-Efficient Real-World Robot Manipulation

Researchers have developed a neuro-symbolic framework that enables robots to learn complex manipulation tasks from as few as one demonstration, without requiring manual programming or large datasets. The system uses Vision-Language Models to automatically construct symbolic planning domains and has been validated on real industrial equipment including forklifts and robotic arms.

AIBullisharXiv – CS AI · Mar 177/10

🧠

Fine-tuning is Not Enough: A Parallel Framework for Collaborative Imitation and Reinforcement Learning in End-to-end Autonomous Driving

Researchers propose PaIR-Drive, a new parallel framework that combines imitation learning and reinforcement learning for autonomous driving, achieving 91.2 PDMS performance on NAVSIMv1 benchmark. The approach addresses limitations of sequential fine-tuning by running IL and RL in parallel branches, enabling better performance than existing methods.

AIBullisharXiv – CS AI · Mar 177/10

🧠

What Matters for Scalable and Robust Learning in End-to-End Driving Planners?

Researchers introduce BevAD, a new lightweight end-to-end autonomous driving architecture that achieves 72.7% success rate on the Bench2Drive benchmark. The study systematically analyzes architectural patterns in closed-loop driving performance, revealing limitations of open-loop dataset approaches and demonstrating strong data-scaling behavior through pure imitation learning.

AIBullisharXiv – CS AI · Mar 167/10

🧠

Guided Policy Optimization under Partial Observability

Researchers introduce Guided Policy Optimization (GPO), a new reinforcement learning framework that addresses challenges in partially observable environments by co-training a guider with privileged information and a learner through imitation learning. The method demonstrates theoretical optimality comparable to direct RL and shows strong empirical performance across various tasks including continuous control and memory-based challenges.

AIBullisharXiv – CS AI · Mar 56/10

🧠

IROSA: Interactive Robot Skill Adaptation using Natural Language

Researchers present IROSA, a framework combining foundation models with imitation learning for robot skill adaptation using natural language commands. The system uses a tool-based architecture that maintains safety by creating an abstraction layer between language models and robot hardware, demonstrated on industrial bearing ring insertion tasks.

AIBullisharXiv – CS AI · Mar 47/103

🧠

Learning Object-Centric Spatial Reasoning for Sequential Manipulation in Cluttered Environments

Researchers developed Unveiler, a robotic manipulation framework that uses object-centric spatial reasoning to retrieve items from cluttered environments. The system achieves up to 97.6% success in simulation by separating high-level spatial reasoning from low-level action execution, and demonstrates zero-shot transfer to real-world scenarios.

AIBullisharXiv – CS AI · Mar 47/102

🧠

Tether: Autonomous Functional Play with Correspondence-Driven Trajectory Warping

Researchers introduce Tether, a breakthrough method enabling robots to perform autonomous functional play using minimal human demonstrations (≤10). The system generates over 1000 expert-level trajectories through continuous cycles of task execution and improvement, representing a significant advance in autonomous robotics learning.

AIBullisharXiv – CS AI · Mar 46/102

🧠

How to Peel with a Knife: Aligning Fine-Grained Manipulation with Human Preference

Researchers developed a two-stage learning framework enabling robots to perform complex manipulation tasks like food peeling with over 90% success rates. The system combines force-aware imitation learning with human preference-based refinement, achieving strong generalization across different produce types using only 50-200 training examples.

AIBullisharXiv – CS AI · Mar 37/103

🧠

Model Predictive Adversarial Imitation Learning for Planning from Observation

Researchers have developed a new approach called Model Predictive Adversarial Imitation Learning that combines inverse reinforcement learning with model predictive control to enable AI agents to learn from incomplete human demonstrations. The method shows significant improvements in sample efficiency, generalization, and robustness compared to traditional imitation learning approaches.

AIBullisharXiv – CS AI · Feb 277/106

🧠

On Discovering Algorithms for Adversarial Imitation Learning

Researchers have developed DAIL (Discovered Adversarial Imitation Learning), the first meta-learned AI algorithm that uses LLM-guided evolutionary methods to automatically discover reward assignment functions for training AI agents. This breakthrough addresses stability issues in adversarial imitation learning and demonstrates superior performance compared to human-designed approaches across different environments.

AINeutralarXiv – CS AI · Feb 277/107

🧠

Learning to Answer from Correct Demonstrations

Researchers propose a new approach for training AI models to generate correct answers from demonstrations, using imitation learning in contextual bandits rather than traditional supervised fine-tuning. The method achieves better sample complexity and works with weaker assumptions about the underlying reward model compared to existing likelihood-maximization approaches.

AINeutralarXiv – CS AI · May 126/10

🧠

Zero-shot Imitation Learning by Latent Topology Mapping

Researchers introduce ZALT, an imitation learning method that enables AI agents to solve unseen tasks by identifying latent hub states in demonstrated trajectories and planning over abstract topology. The approach achieves 55% zero-shot success on complex maze tasks compared to 6% for existing baselines, addressing the challenge of adapting learned behaviors to new long-horizon goals without additional training.

AINeutralarXiv – CS AI · May 115/10

🧠

Drifting Field Policy: A One-Step Generative Policy via Wasserstein Gradient Flow

Researchers introduce Drifting Field Policy (DFP), a one-step generative policy that uses Wasserstein gradient flow to optimize reinforcement learning without ODE-based approaches. DFP demonstrates state-of-the-art performance on robotic manipulation tasks, suggesting a potential shift in how generative models are applied to control problems.

AINeutralarXiv – CS AI · May 116/10

🧠

TAVIS: A Benchmark for Egocentric Active Vision and Anticipatory Gaze in Imitation Learning

Researchers introduced TAVIS, a comprehensive benchmark for evaluating active vision in imitation learning systems where robotic policies control their own gaze during manipulation tasks. The benchmark includes evaluation protocols, a novel metric (GALT) measuring anticipatory gaze, and baseline experiments showing that active vision benefits are task-dependent rather than universally beneficial.

🏢 Hugging Face

AIBullisharXiv – CS AI · Mar 96/10

🧠

PRISM: Personalized Refinement of Imitation Skills for Manipulation via Human Instructions

PRISM is a new AI method that combines imitation learning and reinforcement learning to train robotic manipulation systems using human instructions and feedback. The approach allows generic robotic policies to be refined for specific tasks through natural language descriptions and human corrections, improving performance in pick-and-place tasks while reducing computational requirements.

AIBullisharXiv – CS AI · Mar 35/104

🧠

Reference Grounded Skill Discovery

Researchers developed Reference-Grounded Skill Discovery (RGSD), a new AI algorithm that enables high-dimensional agents to learn complex skills by grounding discovery in semantically meaningful reference data. The method successfully taught a simulated humanoid with 359-dimensional observations to imitate and vary behaviors like walking, running, and punching while outperforming traditional imitation learning approaches.

AIBullisharXiv – CS AI · Mar 27/1014

🧠

Less is More: Lean yet Powerful Vision-Language Model for Autonomous Driving

Researchers introduce Max-V1, a novel vision-language model framework that treats autonomous driving as a language problem, predicting trajectories from camera input. The model achieved over 30% performance improvement on the nuScenes dataset and demonstrates strong cross-vehicle adaptability.

AIBullisharXiv – CS AI · Feb 276/105

🧠

Risk-Aware World Model Predictive Control for Generalizable End-to-End Autonomous Driving

Researchers developed Risk-aware World Model Predictive Control (RaWMPC), a new framework for autonomous driving that makes safe decisions without relying on expert demonstrations. The system uses a world model to predict consequences of multiple actions and selects low-risk options through explicit risk evaluation, showing superior performance in both normal and rare driving scenarios.

AINeutralarXiv – CS AI · Apr 155/10

🧠

Hybrid-AIRL: Enhancing Inverse Reinforcement Learning with Supervised Expert Guidance

Researchers introduce Hybrid-AIRL, an enhanced inverse reinforcement learning framework that combines adversarial learning with supervised expert guidance to improve reward function inference in complex, imperfect-information environments like poker. The method demonstrates superior sample efficiency and learning stability compared to traditional AIRL, particularly in settings with sparse and delayed rewards.

AINeutralMicrosoft Research Blog · Feb 54/102

🧠

Rethinking imitation learning with Predictive Inverse Dynamics Models

Microsoft Research explores Predictive Inverse Dynamics Models (PIDMs) in imitation learning, showing they outperform standard Behavior Cloning by using predictions to reduce ambiguity. The approach enables more efficient learning from fewer demonstrations compared to traditional methods.

AINeutralarXiv – CS AI · Mar 24/105

🧠

Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parameteric Policies

Researchers present theoretical advances in offline reinforcement learning that extend beyond current limitations to work with parameterized policies over large or continuous action spaces. The work connects mirror descent to natural policy gradient methods and reveals a surprising unification between offline RL and imitation learning.

AINeutralOpenAI News · Mar 63/105

🧠

Third-person imitation learning

The article title references third-person imitation learning, a machine learning technique where AI systems learn by observing interactions between other agents rather than direct demonstration. However, no article body content was provided for analysis.

AINeutralOpenAI News · Mar 211/107

🧠

One-shot imitation learning

The article title references one-shot imitation learning, a machine learning technique where AI systems learn to perform tasks from observing just a single demonstration. However, the article body appears to be empty, providing no substantive content to analyze.