#nvfp4 News & Analysis

2 articles tagged with #nvfp4. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

ScaleSweep: Accurate NVFP4 Post-Training Quantization of LLMs via Block Scale Initialization

ScaleSweep introduces an optimized block scale initialization method for NVFP4 quantization of large language models, improving upon traditional AbsMax approaches. The technique theoretically bounds the search space and empirically achieves 93% performance retention under aggressive 4-bit quantization, advancing hardware-efficient AI inference.
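To make the contrast with AbsMax concrete, here is a minimal Python sketch of per-block scale selection on the FP4 (E2M1) value grid. It is not the ScaleSweep algorithm: the `swept_scale` function and its `ratios` grid are hypothetical stand-ins for a generic error-minimizing search, and the sketch ignores that real NVFP4 stores block scales in FP8 alongside a per-tensor scale.

```python
# Magnitudes representable in FP4 E2M1; the sign is handled separately.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_MAX = 6.0


def quantize_block(block, scale):
    """Quantize then dequantize a block: divide by scale, round to the grid, rescale."""
    if scale == 0.0:
        return [0.0 for _ in block]
    out = []
    for x in block:
        mag = min(abs(x) / scale, FP4_MAX)  # clip to the largest FP4 magnitude
        q = min(FP4_GRID, key=lambda g: abs(g - mag))  # nearest grid point
        out.append((q if x >= 0 else -q) * scale)
    return out


def mse(block, scale):
    """Mean squared reconstruction error of a block for a given scale."""
    deq = quantize_block(block, scale)
    return sum((a - b) ** 2 for a, b in zip(block, deq)) / len(block)


def absmax_scale(block):
    """AbsMax baseline: map the largest magnitude in the block onto FP4_MAX."""
    return max(abs(x) for x in block) / FP4_MAX


def swept_scale(block, ratios=None):
    """Hypothetical search: try multiples of the AbsMax scale, keep the lowest-MSE one."""
    if ratios is None:
        ratios = [0.5 + 0.05 * i for i in range(21)]  # multipliers from 0.5 to 1.5
    base = absmax_scale(block)
    candidates = [base] + [base * r for r in ratios]
    return min(candidates, key=lambda s: mse(block, s))
```

Because the AbsMax scale is always among the candidates, the swept scale can never reconstruct a block worse than AbsMax in this sketch; whether it does better depends on how the block's values fall on the grid.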

🧠 Llama

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

Beyond Output Matching: Preserving Internal Geometry in NVFP4 LLM Distillation

Researchers propose CKA-QAD, a new method for quantizing large language models to NVFP4 precision that preserves internal representational geometry rather than just matching output distributions. The approach addresses a critical limitation in existing quantization-aware distillation techniques, showing significant improvements in reasoning and coding task performance across multiple model architectures.
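As an illustration of what comparing internal representational geometry can mean, the sketch below computes linear centered kernel alignment between two activation matrices. It assumes the "CKA" in CKA-QAD refers to centered kernel alignment, which the summary does not state, and it is not the paper's training objective; the function names are hypothetical.

```python
import math


def center_columns(m):
    """Subtract each column's mean; rows are samples, columns are features."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def frob_sq_of_cross(a, b):
    """Squared Frobenius norm of a^T b for row-major a (n x p) and b (n x q)."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(u * v for u, v in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA between two sets of activations taken on the same n inputs."""
    xc, yc = center_columns(x), center_columns(y)
    num = frob_sq_of_cross(xc, yc)
    den = math.sqrt(frob_sq_of_cross(xc, xc) * frob_sq_of_cross(yc, yc))
    return num / den if den > 0 else 0.0
```

A score near 1 means the two layers arrange the same inputs in a similar way up to rotation and uniform scaling, which is a different criterion from matching the final output distribution.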