AI · Bearish · arXiv - CS AI · Mar 17 · 7/10
$\tau$-Voice: Benchmarking Full-Duplex Voice Agents on Real-World Domains
Researchers introduce τ-Voice, a new benchmark for evaluating full-duplex voice AI agents on complex real-world tasks. The study reveals significant performance gaps, with voice agents achieving only 30-45% of text-based AI capability under realistic conditions with noise and diverse accents.
GPT-5