y0news

#bert-variants News & Analysis

1 article tagged with #bert-variants. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 6h ago · 6/10

IHUBERT: Vector-Based Semantic Deduplication and Domain-Balanced Pretraining for Persian Resources

Researchers have developed IHUBERT, a new 125-million-parameter Persian language model trained on a curated 45 GB corpus built with vector-based semantic deduplication and domain balancing. The model achieves state-of-the-art results on multiple Persian NLP benchmarks, particularly in extractive question answering, and addresses the long-standing scarcity of high-quality Persian pretraining resources.