AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6

Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning

Researchers developed an unbiased sliced Wasserstein RBF kernel, combined with rotary positional embedding, to improve audio captioning by mitigating exposure bias and better capturing temporal relationships. The method is reported to significantly improve caption quality and text-to-audio retrieval accuracy on the AudioCaps and Clotho datasets, and to enhance the audio reasoning capabilities of large language models.
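
For readers unfamiliar with the term, the following is a minimal sketch of a plain Monte Carlo sliced Wasserstein RBF kernel between two equal-size sets of embedding vectors. It is not the paper's method: it omits the unbiased estimator and the rotary positional embedding mentioned in the summary, and the function names, `gamma`, and `n_projections` are hypothetical choices made for illustration. Plugging a Monte Carlo distance estimate into the exponential, as done here, generally yields a biased estimate of the kernel value.

```python
import numpy as np


def sliced_wasserstein_sq(x, y, n_projections=64, rng=None):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    x, y: arrays of shape (n, d) with the same n (equal-size point sets).
    """
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Draw random directions and normalize them onto the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sets onto each direction, then sort along the sample axis.
    # For equal-size 1D samples, W2^2 is the mean squared gap of sorted values.
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    # Average over samples and over projections.
    return float(np.mean((px - py) ** 2))


def sw_rbf_kernel(x, y, gamma=1.0, n_projections=64, rng=None):
    """Plug-in RBF kernel exp(-gamma * SW^2); illustrative, not unbiased."""
    return float(np.exp(-gamma * sliced_wasserstein_sq(x, y, n_projections, rng)))


if __name__ == "__main__":
    gen = np.random.default_rng(0)
    audio = gen.normal(size=(50, 8))  # stand-in for audio frame embeddings
    text = audio + 1.0                # stand-in for shifted text embeddings
    print(sw_rbf_kernel(audio, audio, rng=0))
    print(sw_rbf_kernel(audio, text, rng=0))
```

Identical inputs give a kernel value of exactly 1, and the value decays toward 0 as the two sets move apart. Passing the same integer `rng` to both calls reuses the same projection directions, which keeps comparisons between pairs consistent.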