AI · Bullish · arXiv – CS AI · 10h ago · 6/10
Metal-Sci: A Scientific Compute Benchmark for Evolutionary LLM Kernel Search on Apple Silicon
Researchers introduce Metal-Sci, a benchmark suite for optimizing scientific compute kernels on Apple Silicon using evolutionary, LLM-driven search. Reported speedups range from 1.0x to 10.7x across scientific computing tasks. The benchmark also adds a held-out validation mechanism that catches silent regressions in generalization, exposing flaws that in-distribution metrics alone cannot detect.
🧠 GPT-5 · 🧠 Claude · 🧠 Opus