AI · Neutral · arXiv – CS AI · 9h ago · 6/10
Do Joint Audio-Video Generation Models Understand Physics?
Researchers introduced AV-Phys Bench, a benchmark that tests whether joint audio-video generation models truly understand physics or merely generate plausible outputs. Across seven models and three scene categories, the study found that no system shows robust physical understanding: performance collapses on deliberately inconsistent prompts and in transition-heavy scenarios.