AI · Neutral · arXiv – CS AI · 4h ago · 6/10
End-to-End Voice Intent Recognition for Spontaneous Human-Drone Interaction with Naive Users
Researchers have developed an end-to-end voice intent recognition system for drone control that handles spontaneous, natural speech from untrained (naive) users, reaching 82% accuracy with minimal latency. The system combines self-supervised learning with cross-modal knowledge distillation, which removes the need for manual transcription, and it significantly outperforms traditional cascade approaches in both speed and accuracy.
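
To make the distillation idea concrete, here is a minimal sketch of a generic knowledge-distillation loss in plain Python. It is not the paper's implementation. It assumes a teacher model working in another modality (for example text) and a speech-based student that both emit logits over the same intent classes. The function names, the three example intents, and the `temperature` and `alpha` values are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    Assumption: this is the standard distillation recipe, not necessarily
    the loss used in the summarized work.
    """
    # Soft-target term: KL(teacher || student) on temperature-softened
    # distributions, scaled by T^2 as is conventional.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)

    # Hard-label term: cross-entropy of the student against the true intent.
    ce = -math.log(softmax(student_logits)[label])

    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Hypothetical intent classes: 0 = take_off, 1 = land, 2 = hover.
teacher = [2.0, 0.5, -1.0]        # teacher favours "take_off"
good_student = [2.0, 0.5, -1.0]   # agrees with the teacher
bad_student = [-1.0, 0.5, 2.0]    # favours "hover" instead

print(distillation_loss(good_student, teacher, label=0))
print(distillation_loss(bad_student, teacher, label=0))
```

In this sketch a student that agrees with the teacher pays no soft-target penalty, so its loss is lower than that of a student that disagrees. The teacher is needed only during training, so at inference the student can map audio to an intent without a transcription step.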