AI · Neutral · arXiv – CS AI · 10h ago · 5/10
How Well Do Self-Supervised Speech Models Encode Age and Gender in Children's Speech? A Layer-Wise Analysis Across Multiple Architectures
Researchers conducted a layer-wise analysis of how four major self-supervised learning (SSL) speech models encode age and gender information in children's speech. The study finds that age and gender cues are unevenly distributed across model layers, with early-to-mid layers carrying the strongest paralinguistic signals, and that classification remains reliable even from 1-3 second audio segments.
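
For readers unfamiliar with the term, a layer-wise analysis usually means training a small classifier (a "probe") on the frozen representations of each layer and comparing accuracies across layers. The sketch below illustrates that idea only; it is not the paper's pipeline. Everything in it is an assumption made for illustration: the features are synthetic Gaussian frames standing in for real SSL hidden states, the probe is a simple nearest-centroid classifier, and the per-layer signal strengths passed to the hypothetical `layerwise_probe` helper are made-up numbers, not results from the study.

```python
import random


def mean_pool(frames):
    """Average frame vectors (T x D) into a single utterance vector (D)."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def fit_centroids(vectors, labels):
    """Nearest-centroid probe: store the mean vector of each class."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, vec):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(center, vec))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def synthetic_layer(n_utts, n_frames, dim, signal, rng):
    """Hypothetical stand-in for one layer's hidden states.

    Two balanced classes whose frame means differ by 2 * signal;
    signal = 0 means the layer carries no class information.
    """
    data = []
    for i in range(n_utts):
        label = i % 2
        shift = signal if label == 1 else -signal
        frames = [[rng.gauss(shift, 1.0) for _ in range(dim)]
                  for _ in range(n_frames)]
        data.append((mean_pool(frames), label))
    return data


def layerwise_probe(signal_per_layer, seed=0, n_utts=40, n_frames=20, dim=8):
    """Fit and score one probe per layer; return the list of test accuracies."""
    rng = random.Random(seed)
    accuracies = []
    for signal in signal_per_layer:
        train = synthetic_layer(n_utts, n_frames, dim, signal, rng)
        test = synthetic_layer(n_utts, n_frames, dim, signal, rng)
        centroids = fit_centroids([v for v, _ in train], [y for _, y in train])
        correct = sum(predict(centroids, v) == y for v, y in test)
        accuracies.append(correct / len(test))
    return accuracies


if __name__ == "__main__":
    # Made-up signal profile: weak at the first layer, strong in the middle.
    for layer, acc in enumerate(layerwise_probe([0.0, 1.0, 0.2])):
        print(f"layer {layer}: probe accuracy {acc:.2f}")
```

With a real model, the synthetic frames would be replaced by the hidden states of each layer for short audio segments, and the probe would typically be a linear classifier; the comparison across layers works the same way.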