AIBearish · arXiv – CS AI · 6h ago · 7/10
🧠
Sparse Tokens Suffice: Jailbreaking Audio Language Models via Token-Aware Gradient Optimization
Researchers demonstrate that audio language models can be jailbroken via sparse token optimization rather than dense waveform updates: Token-Aware Gradient Optimization (TAGO) achieves attack success rates comparable to dense baselines while modifying only 25% of audio tokens. The findings show that gradient energy concentrates in specific audio regions, suggesting future AI safety research should account for this heterogeneous token-level structure.
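The core idea — rank tokens by gradient energy and perturb only a small budget of them — can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual TAGO algorithm; the function name, learning rate, and energy measure are assumptions for illustration.

```python
import numpy as np

def sparse_token_update(tokens, grads, frac=0.25, lr=0.1):
    """Perturb only the fraction of tokens with the highest gradient
    energy (hypothetical sketch of token-aware sparse optimization)."""
    energy = np.sum(grads ** 2, axis=-1)    # per-token gradient energy
    k = max(1, int(frac * len(tokens)))     # token budget, e.g. 25%
    top = np.argsort(energy)[-k:]           # indices of highest-energy tokens
    mask = np.zeros(len(tokens), dtype=bool)
    mask[top] = True
    updated = tokens.copy()
    updated[mask] -= lr * grads[mask]       # gradient step on selected tokens only
    return updated, mask
```

In this toy form, the selection step is what makes the attack sparse: tokens whose gradients carry little energy are left untouched, so the perturbation concentrates where the loss is most sensitive.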