y0news

#performance-tuning News & Analysis

5 articles tagged with #performance-tuning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

AI-PROPELLER: Warehouse-Scale Interprocedural Code Layout Optimization with AlphaEvolve

AI-PROPELLER introduces the first warehouse-scale interprocedural code layout optimization system, using an evolutionary AI workflow to improve binary performance by 0.23-1.6% beyond existing post-link optimizers. This breakthrough applies machine learning to compiler optimization in industrial production environments, achieving measurable real-world performance gains.

AI · Bullish · arXiv – CS AI · May 29 · 7/10

PassNet: Scaling Large Language Models for Graph Compiler Pass Generation

PassNet introduces the first large-scale ecosystem for using large language models to generate compiler passes, the structured graph transformations that optimize tensor compiler performance. The framework includes 18K computational graphs and 200 curated benchmark tasks. Results show that while LLMs lag frontier models by 37% on average, they achieve up to 3x speedups on individual workloads, suggesting that consistency, rather than capability, is the limiting factor.

AI · Bullish · arXiv – CS AI · Apr 10 · 7/10

AI-Driven Research for Databases

Researchers propose AI-Driven Research for Systems (ADRS), a framework using large language models to automate database optimization by generating and evaluating hundreds of candidate solutions. By co-evolving evaluators with solutions, the team demonstrates discovery of novel algorithms achieving up to 6.8x latency improvements over existing baselines in buffer management, query rewriting, and index selection tasks.

AI · Neutral · Hugging Face Blog · 2d ago · 6/10

Profiling in PyTorch (Part 2): From nn.Linear to a Fused MLP

This article demonstrates PyTorch profiling techniques for optimizing neural network performance, comparing standard nn.Linear layers with fused MLP implementations. It shows how developer-level optimization practices can improve model efficiency, which is relevant to both open-source ML communities and production deployments.

AI · Bullish · arXiv – CS AI · May 28 · 6/10

Learning When to Optimize: Verified Optimization Skills from Expert GPU-Kernel Lineages

Researchers introduce KLineage, a system that teaches LLM-based agents when to apply GPU kernel optimizations by learning from expert implementations through backward validation rather than forward trial-and-error. The approach extracts reusable optimization skills that encode not just what optimizations work, but the conditions and contexts where they're valid, demonstrating improved kernel quality over existing memory-based baselines.

🏢 Nvidia