Multimodal Group Emotion Recognition In-the-Wild: Towards a Privacy-Safe Non-Individual Approach
Researchers propose privacy-preserving group emotion recognition (GER) systems that rely on multimodal audio-video analysis rather than on individual biometric data. Two novel architectures, a cross-attention fusion model and a Variational Encoder Multi-Decoder framework, demonstrate that competitive emotion inference is achievable at the collective level without monitoring individual faces, voices, or gazes.
