
Towards Reliable Audio Deepfake Attribution and Model Recognition: A Multi-Level Autoencoder-Based Framework

arXiv – CS AI | Andrea Di Pierno (IMT School of Advanced Studies), Luca Guarnera (University of Catania), Dario Allegra (University of Catania), Sebastiano Battiato (University of Catania)
AI Summary

Researchers introduce LAVA, a hierarchical framework that uses a convolutional autoencoder to attribute audio deepfakes to their source, with reported F1-scores above 95%. The system moves beyond detection to identify which generation technology, and which specific AI model, created a piece of fraudulent audio.

Analysis

Audio deepfakes threaten digital trust and identity verification systems, yet most research focuses on detection rather than attribution, that is, identifying which technology or model generated the fake. LAVA targets this gap with two tiers: Audio Deepfake Attribution (ADA) identifies the generation technology used, while Audio Deepfake Model Recognition (ADMR) pinpoints the specific model instance. The framework reports F1-scores exceeding 95% on multiple benchmarks and remains robust under open-set conditions, where it encounters attack types not seen during training.
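The two-tier flow can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the per-technology lookup of ADMR classifiers, the `attribute` function, the toy classifiers, and labels such as `tech_A` and `model_A1` are all hypothetical names introduced here, and the article does not say how the two classifiers are actually wired together.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

Features = Sequence[float]
# A classifier maps a feature vector to {label: probability}.
Classifier = Callable[[Features], Dict[str, float]]


@dataclass
class Attribution:
    technology: str  # first tier (ADA): which generation technology
    model: str       # second tier (ADMR): which specific model instance


def top_label(probs: Dict[str, float]) -> str:
    """Return the label with the highest probability."""
    return max(probs, key=probs.get)


def attribute(features: Features,
              ada: Classifier,
              admr_by_technology: Dict[str, Classifier]) -> Attribution:
    """Run ADA first, then the ADMR classifier registered for that technology.

    Assumption: one ADMR classifier per technology (hypothetical wiring).
    """
    technology = top_label(ada(features))
    admr = admr_by_technology[technology]
    return Attribution(technology, top_label(admr(features)))


# Toy stand-ins for trained classifiers (hypothetical labels and logic).
def toy_ada(features: Features) -> Dict[str, float]:
    p = features[0]
    return {"tech_A": p, "tech_B": 1.0 - p}


def toy_admr_a(features: Features) -> Dict[str, float]:
    p = features[1]
    return {"model_A1": p, "model_A2": 1.0 - p}


def toy_admr_b(features: Features) -> Dict[str, float]:
    return {"model_B1": 1.0}


ADMR_TABLE = {"tech_A": toy_admr_a, "tech_B": toy_admr_b}
result = attribute([0.9, 0.2], toy_ada, ADMR_TABLE)
print(result)
```

Because the second tier depends on the first, a wrong ADA decision routes the sample to the wrong ADMR classifier. Any hierarchical design of this kind has to account for that error propagation.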

The technical approach trains a convolutional autoencoder exclusively on fake audio and extracts attention-enhanced latent representations that serve as input to the two specialized classifiers. Training only on fakes matters because it sidesteps the challenge of acquiring representative real-world audio while still generalizing well. Evaluation covers ASVspoof2021, FakeOrReal, and CodecFake, plus attacks from ASVspoof2019 LA that the system did not see during training.
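To make "attention-enhanced latent representation" concrete, here is a minimal sketch of one common way to apply attention to encoder outputs: softmax-weighted pooling over time frames. The article does not specify LAVA's attention mechanism, so the `attention_pool` function, the `query` vector, and the pooling scheme are assumptions for illustration. The convolutional encoder itself is omitted, and the latent frames are given directly.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(latent_frames: List[List[float]],
                   query: List[float]) -> List[float]:
    """Collapse T latent frames (each of dimension D) into one D-dim vector.

    Each frame is scored by its dot product with a query vector (learned in
    a real system; hypothetical here), the scores are normalized with
    softmax, and the frames are averaged with those weights.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in latent_frames]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * frame[d] for w, frame in zip(weights, latent_frames))
            for d in range(dim)]


# Two toy latent frames standing in for encoder output.
frames = [[1.0, 0.0], [0.0, 1.0]]
uniform = attention_pool(frames, [0.0, 0.0])   # equal scores, equal weights
focused = attention_pool(frames, [10.0, 0.0])  # first frame dominates
print(uniform, focused)
```

The pooled vector is what a downstream classifier such as ADA or ADMR would consume in this sketch. Frames that the query scores highly contribute more to it.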

For the broader security ecosystem, reliable deepfake attribution enables forensic investigations, source accountability, and targeted model remediation. Organizations that rely on voice authentication (banking systems, government agencies, emergency services) gain tools to verify audio authenticity and trace manipulation sources. The public release of code and models widens access to attribution technology and should speed adoption across security applications.

Looking ahead, the field must address real-time processing requirements and integration with production voice systems. Open questions include robustness against adversarial attacks designed specifically to fool attribution systems, and scalability as deepfake generation technologies continue to evolve.

Key Takeaways
  • LAVA reports F1-scores above 95% for identifying deepfake generation technology (ADA) and 96.31% accuracy for recognizing specific model instances (ADMR)
  • The framework operates under open-set conditions, maintaining robustness when encountering previously unseen attack types
  • Attribution-focused research advances beyond simple detection to enable forensic accountability and model-specific remediation
  • Public code release accelerates ecosystem adoption for audio deepfake forensics in voice authentication systems
  • Confidence-based rejection thresholds let the system decline to label low-confidence inputs instead of forcing a guess, which supports reliability on unseen attack types
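
The confidence-based rejection mentioned in the takeaways can be illustrated with a short sketch. This is a generic version of the idea, not LAVA's actual rule: the `predict_or_reject` name and the default threshold of 0.8 are hypothetical, and the article does not give the thresholds or the confidence measure used.

```python
from typing import Dict, Optional


def predict_or_reject(probs: Dict[str, float],
                      threshold: float = 0.8) -> Optional[str]:
    """Return the top label only if its probability clears the threshold.

    Returns None (reject) when the classifier is not confident enough,
    which is how an open-set system can flag a possibly unseen source
    instead of forcing one of the known labels. The 0.8 default is an
    arbitrary illustrative value.
    """
    label = max(probs, key=probs.get)
    return label if probs[label] >= threshold else None


confident = predict_or_reject({"tech_A": 0.93, "tech_B": 0.07})
uncertain = predict_or_reject({"tech_A": 0.55, "tech_B": 0.45})
print(confident, uncertain)
```

Raising the threshold rejects more samples but makes the accepted predictions more trustworthy. The right trade-off depends on how costly a wrong attribution is in the target deployment.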