AI · Bullish · arXiv - CS AI · 10h ago · 6/10
HiFloat4 Format for Language Model Pre-training on Ascend NPUs
Researchers demonstrate that HiFloat4, a 4-bit floating-point format, enables efficient large language model training on Huawei's Ascend NPUs with up to 4x improvements in compute throughput and memory efficiency. The study shows that specialized stabilization techniques can maintain accuracy within 1% of full-precision baselines while preserving computational gains across dense and mixture-of-experts architectures.