y0news
#captioning (1 article)
AI · Bullish · arXiv – CS AI · 10h ago · 4/10

LAMB: LLM-based Audio Captioning with Modality Gap Bridging via Cauchy-Schwarz Divergence

Researchers have developed LAMB, a new AI framework for automated audio captioning that aligns audio features more closely with the text embedding space of a large language model by optimizing a Cauchy-Schwarz divergence. By bridging the modality gap between audio and text embeddings, the system achieved state-of-the-art performance on the AudioCaps dataset.
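
For readers unfamiliar with the term, the Cauchy-Schwarz divergence between two distributions p and q is -log((∫pq)² / (∫p² ∫q²)). It is zero when the distributions match and grows as they drift apart. The summary does not say how LAMB estimates or applies it, so the following is only a minimal sketch of a common sample-based estimate using a Gaussian kernel. The function names, the bandwidth `sigma`, and the toy 2-D "embeddings" are all illustrative assumptions, not taken from the paper.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian similarity between two vectors (sigma is an assumed bandwidth)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def cs_divergence(xs, ys, sigma=1.0):
    """Kernel-based estimate of the Cauchy-Schwarz divergence between two sample sets.

    Computes log(mean k(x, x')) + log(mean k(y, y')) - 2 * log(mean k(x, y)),
    which is non-negative and equals zero when both sample sets coincide.
    """
    within_x = mean_kernel(xs, xs, sigma)
    within_y = mean_kernel(ys, ys, sigma)
    cross = mean_kernel(xs, ys, sigma)
    return math.log(within_x) + math.log(within_y) - 2.0 * math.log(cross)


# Toy stand-ins for audio and text embeddings (hypothetical values).
audio = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
text_close = [[0.1, 0.0], [1.1, 0.0], [0.1, 1.0]]
text_far = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]]

print("close:", cs_divergence(audio, text_close))
print("far:  ", cs_divergence(audio, text_far))
```

In this toy setup the "far" pair yields a larger value than the "close" pair. That is the property that would make such a quantity usable as an alignment loss, with a smaller divergence meaning a smaller gap between the two embedding sets.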