
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

arXiv – CS AI | Suhwan Choi, Jaeyoon Jung, Haebin Seong, Minchan Kim, Minyeong Kim, Yongjun Cho, Yoonshik Kim, Yubeen Park, Youngjae Yu, Yunsung Lee
🤖 AI Summary

Researchers developed D2E (Desktop to Embodied AI), a framework that uses desktop gaming data to pretrain AI models for robotics tasks. Their 1B-parameter model achieved 96.6% success on manipulation tasks and 83.3% on navigation, matching performance of models up to 7 times larger while using scalable desktop data instead of expensive physical robot training data.

Key Takeaways
  • The D2E framework successfully transfers learning from desktop gaming environments to real-world robotics tasks.
  • The approach uses 1.3K+ hours of desktop data with 152x compression through the OWA Toolkit.
  • A 1B-parameter model matches the performance of much larger models such as OpenVLA (7B parameters).
  • The framework includes public datasets and tools, unlike previous proprietary approaches.
  • Desktop pretraining offers a cost-effective alternative to expensive physical robot data collection.
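
To give a rough sense of what 152x compression means for 1.3K+ hours of desktop recordings, here is a back-of-envelope sketch. The raw screen-recording bitrate below is an assumed figure for illustration only; the article does not state raw data sizes.

```python
# Illustrative storage estimate for the desktop pretraining corpus.
# RAW_MBPS is an assumption, not a number from the paper.
HOURS = 1300            # 1.3K+ hours of desktop data (from the summary)
RAW_MBPS = 8            # assumed raw screen-recording bitrate, megabits/s
COMPRESSION = 152       # compression factor reported for the OWA Toolkit

raw_gb = HOURS * 3600 * RAW_MBPS / 8 / 1000   # megabits -> megabytes -> GB
compressed_gb = raw_gb / COMPRESSION
print(f"raw ≈ {raw_gb:,.0f} GB, compressed ≈ {compressed_gb:,.1f} GB")
```

Under that assumed bitrate, the corpus shrinks from terabyte scale to a few tens of gigabytes, which is what makes desktop data collection so much cheaper to scale than physical robot teleoperation.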