#manipulation-tasks News & Analysis

14 articles tagged with #manipulation-tasks. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

LadderMan: Learning Humanoid Perceptive Ladder Climbing

Researchers have developed LadderMan, a humanoid robot system that learns to climb ladders and perform manipulation tasks using a two-stage learning pipeline that combines imitation learning and reinforcement learning with vision foundation models. The system transfers from simulation to real-world hardware without additional training. Ladder climbing is one of the most challenging tasks in robotics because of its sparse contact points and complex coordination requirements.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

RePO-VLA: Recovery-Driven Policy Optimization for Vision-Language-Action Models

Researchers introduce RePO-VLA, a policy optimization framework that improves Vision-Language-Action models' ability to recover from failures in complex manipulation tasks. The method increases adversarial robustness from 20% to 75% by learning from recovery trajectories rather than discarding failed attempts, with validation on both simulated and real-world robotic tasks.

AI · Bullish · arXiv – CS AI · May 7 · 7/10

When Life Gives You BC, Make Q-functions: Extracting Q-values from Behavior Cloning for On-Robot Reinforcement Learning

Researchers introduce Q2RL, a novel algorithm that combines behavior cloning with reinforcement learning to enable robots to improve their policies through online interaction. The method uses Q-value estimation and gating mechanisms to prevent policy degradation from distribution mismatch, achieving 100% success rates on complex manipulation tasks in 1-2 hours of real robot learning.

AI · Bullish · arXiv – CS AI · May 1 · 7/10

PRTS: A Primitive Reasoning and Tasking System via Contrastive Representations

Researchers introduce PRTS, a Vision-Language-Action foundation model that reformulates robotic learning through goal-conditioned reinforcement learning rather than traditional behavior cloning. The system learns to assess goal reachability by embedding state-action pairs and language instructions in a unified space, achieving state-of-the-art performance on multiple robotic benchmarks and real-world tasks.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

Build on Priors: Vision–Language–Guided Neuro-Symbolic Imitation Learning for Data-Efficient Real-World Robot Manipulation

Researchers have developed a neuro-symbolic framework that enables robots to learn complex manipulation tasks from as few as one demonstration, without requiring manual programming or large datasets. The system uses Vision-Language Models to automatically construct symbolic planning domains and has been validated on real industrial equipment including forklifts and robotic arms.

AI · Bearish · arXiv – CS AI · Mar 16 · 7/10

Altered Thoughts, Altered Actions: Probing Chain-of-Thought Vulnerabilities in VLA Robotic Manipulation

Research reveals critical vulnerabilities in Vision-Language-Action robotic models that use chain-of-thought reasoning, where corrupting object names in internal reasoning traces can reduce task success rates by up to 45%. The study shows these AI systems are vulnerable to attacks on their internal reasoning processes, even when primary inputs remain untouched.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

ViVa: A Video-Generative Value Model for Robot Reinforcement Learning

Researchers introduce ViVa, a video-generative value model that enhances robot reinforcement learning by predicting future proprioception and scalar values simultaneously. The approach achieves 80% success rates in manipulation tasks by grounding value estimation in anticipated embodiment dynamics, addressing limitations in existing vision-language models for long-horizon robotics applications.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

RoboBenchMart: Benchmarking Robots in Retail Environment

Researchers introduced RoboBenchMart, an open-source simulated benchmark for evaluating robotic systems in retail dark-store environments. The study reveals that current state-of-the-art vision-language-action (VLA) models struggle with complex grocery manipulation tasks, indicating limitations in their generalization across diverse domains beyond tabletop scenarios.

AI · Bullish · arXiv – CS AI · Jun 1 · 6/10

Mixture of Horizons in Action Chunking

Researchers propose Mixture of Horizons (MoH), a novel technique for vision-language-action models in robotics that processes action sequences at multiple time scales simultaneously to balance long-term planning with short-term precision. The method achieves state-of-the-art performance on robotic manipulation tasks, reaching 99% success rate on LIBERO benchmarks while enabling 2.5x faster inference through adaptive horizon selection.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

RoboWits: Unexpected Challenges for Robotic Creative Problem Solving

Researchers introduced RoboWits, a robotic benchmark that evaluates cognitive reasoning and creative problem-solving under unexpected conditions. The study reveals that current vision-language models struggle with manipulation tasks requiring adaptation and robustness, highlighting a significant gap between seed task performance and real-world generalization.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

AsyncVLA: Asynchronous Flow Matching for Vision-Language-Action Models

Researchers introduce AsyncVLA, a new framework for vision-language-action models that improves robotic task performance by using asynchronous flow matching instead of rigid time schedules. The system adds self-correction capabilities, allowing robots to refine uncertain actions before execution, demonstrating superior results in both simulation and real-world manipulation tasks.

AI · Neutral · arXiv – CS AI · Mar 9 · 6/10

Restoring Linguistic Grounding in VLA Models via Train-Free Attention Recalibration

Researchers have identified a critical failure mode in Vision-Language-Action (VLA) robotic models called 'linguistic blindness,' where robots follow visual cues rather than language instructions when the two conflict. They developed the ICBench benchmark and proposed IGAR, a train-free method that recalibrates attention to restore the influence of language instructions without retraining the model.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14

Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward

Researchers introduced AC3 (Actor-Critic for Continuous Chunks), a new reinforcement learning framework that addresses challenges in long-horizon robotic manipulation tasks with sparse rewards. The system uses continuous action chunks with stabilization mechanisms and achieved superior performance on 25 benchmark tasks using minimal demonstrations.

AI · Bullish · arXiv – CS AI · Mar 3 · 5/10 · 5

Non-Markovian Long-Horizon Robot Manipulation via Keyframe Chaining

Researchers introduce Keyframe-Chaining VLA, a new AI framework that improves robot manipulation for long-horizon tasks by extracting and linking key historical frames to model temporal dependencies. The method addresses a limitation of current Vision-Language-Action models, which struggle with non-Markovian dependencies, where the optimal action depends on specific past states rather than on the current observation alone.