AI · Neutral · arXiv – CS AI · 14h ago · 5/10
TaxDistill: Improving Metagenomic Taxonomic Annotation via Distilled Genomic Foundation Models
TaxDistill is a knowledge distillation framework built on GenomeOcean, a 500M-parameter genomic foundation model. It aims to improve metagenomic taxonomic annotation by reducing the label noise introduced by sequence-similarity tools, and reports a 23.3% improvement in F1 score on gastrointestinal datasets over traditional methods.