y0news

#kernel-fusion News & Analysis

2 articles tagged with #kernel-fusion. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference

Researchers have developed a low-bit mixed-precision attention kernel, Diagonal-Tiled Mixed-Precision Attention (DMA), that significantly speeds up large language model inference on NVIDIA B200 GPUs while maintaining generation quality. The technique combines the microscaling floating-point (MXFP) data format with kernel fusion to reduce the high computational cost of transformer-based models.

๐Ÿข Nvidia
AI · Bullish · arXiv – CS AI · Mar 12 · 7/10

RedFuser: An Automatic Operator Fusion Framework for Cascaded Reductions on AI Accelerators

RedFuser is a new automated framework that optimizes AI model deployment by fusing cascaded reduction operations into single loops, achieving 2-5x performance improvements. The system addresses limitations in existing AI compilers that struggle with complex multi-loop operations like those found in attention mechanisms.
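
To make "fusing cascaded reductions into a single loop" concrete, the sketch below uses the softmax denominator from attention as an example. The sum of exponentials depends on the maximum, so a naive version needs two passes over the data. The fused version uses the well-known online-softmax rescaling trick. It illustrates the general idea only and is not RedFuser's algorithm. The function names are hypothetical, and inputs are assumed to be finite.

```python
import math


def max_and_sum_two_pass(xs):
    """Cascaded reductions as two loops: max first, then sum of exp(x - max)."""
    m = max(xs)
    s = sum(math.exp(x - m) for x in xs)
    return m, s


def max_and_sum_fused(xs):
    """Same two reductions in one loop.

    When the running max grows, the running sum is rescaled by
    exp(old_max - new_max) so it stays expressed relative to the new max.
    """
    m = -math.inf
    s = 0.0
    for x in xs:
        m_new = max(m, x)
        s = s * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    return m, s


def softmax_fused(xs):
    """Softmax built on the single-pass max/sum reduction."""
    m, s = max_and_sum_fused(xs)
    return [math.exp(x - m) / s for x in xs]


if __name__ == "__main__":
    scores = [1.0, 3.0, -2.0, 3.0, 0.5]
    print(max_and_sum_two_pass(scores))
    print(max_and_sum_fused(scores))
    print(softmax_fused(scores))
```

On an accelerator, the single-loop form reads the input from memory once instead of twice, which is the kind of saving operator fusion targets.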