AI · Bullish · arXiv – CS AI · 15h ago · 7/10
Max-Window Scale Estimation for Near-Lossless HiF8 W8A8 Quantization-Aware Training
Researchers develop a systematic approach to quantization-aware training (QAT) for large language models using 8-bit floating-point formats. They identify and fix two critical failure modes, amax saturation and catastrophic forgetting, neither of which surfaces in standard training metrics. The resulting method is near-lossless, with only 0.43% degradation on benchmark tasks, a step toward more efficient practical LLM deployment.