D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI
arXiv – CS AI | Suhwan Choi, Jaeyoon Jung, Haebin Seong, Minchan Kim, Minyeong Kim, Yongjun Cho, Yoonshik Kim, Yubeen Park, Youngjae Yu, Yunsung Lee
AI Summary
Researchers developed D2E (Desktop to Embodied AI), a framework that pretrains AI models on desktop gaming data and transfers them to robotics tasks. Their 1B-parameter model achieved 96.6% success on manipulation tasks and 83.3% on navigation, matching the performance of models up to seven times larger while relying on scalable desktop data rather than expensive physical-robot training data.
Key Takeaways
- The D2E framework successfully transfers learning from desktop gaming environments to real-world robotics tasks.
- The approach uses 1.3K+ hours of desktop data with 152x compression through the OWA Toolkit.
- A 1B-parameter model matches the performance of much larger models like OpenVLA (7B parameters).
- The framework includes public datasets and tools, unlike previous proprietary approaches.
- Desktop pretraining offers a cost-effective alternative to expensive physical robot data collection.
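The core pretraining objective behind this kind of desktop-to-embodied transfer is vision-action behavior cloning: given a screen frame, predict the keyboard/mouse action the user actually took. The toy sketch below illustrates that objective on synthetic data with a tiny two-layer policy; every name, shape, and the linear "true action" rule are illustrative assumptions, not the paper's actual model or data.

```python
import numpy as np

# Hypothetical sketch of vision-action behavior cloning, the style of
# objective used when pretraining on logged desktop gameplay: predict
# the recorded action from the current frame. All shapes and names here
# are illustrative assumptions, not D2E's actual architecture.
rng = np.random.default_rng(0)

N_ACTIONS = 4    # e.g. a toy discrete action set {up, down, left, right}
FRAME_DIM = 64   # flattened toy "frame" features
HIDDEN = 16

# Toy desktop dataset: logged actions follow a hidden linear rule.
W_true = rng.normal(size=(FRAME_DIM, N_ACTIONS))
frames = rng.normal(size=(512, FRAME_DIM))
actions = np.argmax(frames @ W_true, axis=1)   # logged "user" actions

# Two-layer policy: shared encoder + action head.
W1 = rng.normal(scale=0.1, size=(FRAME_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, N_ACTIONS))

def forward(x):
    """ReLU encoder followed by a softmax action head."""
    h = np.maximum(x @ W1, 0.0)
    logits = h @ W2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return h, p / p.sum(axis=1, keepdims=True)

def xent(p, y):
    """Mean cross-entropy of predicted action distribution vs. logs."""
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

lr = 0.1
_, p0 = forward(frames)
loss0 = xent(p0, actions)

for _ in range(200):  # plain full-batch gradient descent
    h, p = forward(frames)
    g = p.copy()
    g[np.arange(len(actions)), actions] -= 1.0   # d(loss)/d(logits)
    g /= len(actions)
    gW2 = h.T @ g
    gh = g @ W2.T
    gh[h <= 0] = 0.0                             # ReLU backprop mask
    gW1 = frames.T @ gh
    W1 -= lr * gW1
    W2 -= lr * gW2

_, p1 = forward(frames)
loss1 = xent(p1, actions)
print(loss1 < loss0)   # behavior-cloning loss decreases
```

In the transfer setting the paper describes, the pretrained encoder (here `W1`) would be reused and the action head adapted to the embodied task, which is what lets desktop-scale data substitute for robot data.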
#embodied-ai #robotics #pretraining #desktop-data #transfer-learning #gaming-ai #machine-learning #computer-vision #open-source
Read Original · via arXiv – CS AI