Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs
Researchers introduce Delta-Code Generation, a method where fine-tuned LLMs generate compact code diffs to modify existing neural architectures rather than creating complete models from scratch. The approach achieves significantly higher validity rates (66-75%) and accuracy (64-66%) compared to baseline full-generation methods while reducing output length by 75-85%, demonstrating a more efficient paradigm for LLM-driven neural architecture search.
Delta-Code Generation represents a meaningful shift in how large language models approach neural architecture search, addressing a fundamental inefficiency in current methods. Rather than generating entire model implementations, the approach trains LLMs to produce unified diffs that refine existing baseline architectures. This mirrors software development practices where developers work with code changes rather than complete rewrites, applying established engineering principles to machine learning.
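As a rough illustration of the idea (the baseline file name, the specific layer edits, and the use of the standard `patch` tool are assumptions made for this sketch, not details from the paper), a generated delta might be applied like this:

```python
# Minimal sketch: instead of emitting a full ~200-line model file, the
# fine-tuned LLM emits a compact unified diff against an existing baseline.
# "baseline_net.py" and the edited layers are hypothetical examples.
import subprocess

generated_diff = """\
--- a/baseline_net.py
+++ b/baseline_net.py
@@ -10,5 +10,6 @@ class BaselineNet(nn.Module):
     def __init__(self):
         super().__init__()
         self.conv1 = nn.Conv2d(3, 32, kernel_size=3)
-        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
+        self.conv2 = nn.Conv2d(32, 128, kernel_size=3)
+        self.drop = nn.Dropout2d(p=0.1)
         self.pool = nn.MaxPool2d(2)
"""

# Apply the delta to the baseline implementation; the patched architecture
# is then trained briefly and scored, rather than being rewritten wholesale.
subprocess.run(["patch", "-p1"], input=generated_diff, text=True, check=True)
```

The diff carries only the changed lines plus a little context, which is where the 75-85% reduction in output length comes from.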
The research stems from growing recognition that full-model synthesis by LLMs produces verbose, computationally expensive outputs. Prior work in neural architecture search relied on generating complete architectures from scratch, consuming significant tokens and GPU resources. This work builds on recent advances in efficient fine-tuning techniques like LoRA and dataset curation, leveraging the LEMUR dataset combined with MinHash-Jaccard filtering to ensure structural diversity in training examples.
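A minimal sketch of how MinHash-Jaccard filtering could enforce structural diversity in the training set, assuming whitespace tokenization, 128 permutations, and a 0.7 similarity threshold (none of which are confirmed settings from the paper), using the `datasketch` library:

```python
# Diversity filtering sketch: keep a code sample only if its estimated
# Jaccard similarity to every previously kept sample is below a threshold.
from datasketch import MinHash

def minhash_of(code: str, num_perm: int = 128) -> MinHash:
    """Build a MinHash signature over the whitespace tokens of a code sample."""
    m = MinHash(num_perm=num_perm)
    for tok in code.split():
        m.update(tok.encode("utf-8"))
    return m

def diversity_filter(samples: list[str], max_jaccard: float = 0.7) -> list[str]:
    """Drop near-duplicate architectures so training examples stay structurally diverse."""
    kept, signatures = [], []
    for code in samples:
        sig = minhash_of(code)
        if all(sig.jaccard(prev) < max_jaccard for prev in signatures):
            kept.append(code)
            signatures.append(sig)
    return kept
```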
The performance improvements are substantial. DeepSeek-Coder-7B achieved 75.3% validity rate compared to 50.6% for baseline methods, with first-epoch accuracy reaching 65.8% versus 42.3%. On CIFAR-10 specifically, the delta-based approach achieved 85.5% accuracy, surpassing both the full-generation baseline and concurrent research from Gu et al. The 75-85% reduction in output length directly translates to lower computational costs and faster inference.
The broader implication involves improving efficiency across AI development workflows. As organizations deploy LLMs for code generation tasks, token efficiency and code quality become direct cost and reliability factors. This research validates that architectural constraints and iterative refinement produce better results than unrestricted generation, potentially influencing how LLMs are applied to software engineering, model optimization, and automated ML pipeline development.
- Delta-based generation achieves 75.3% validity rate versus 50.6% baseline, demonstrating architectural refinement outperforms full-synthesis approaches
- Output length reduced by 75-85% (30-50 lines vs 200+), directly lowering computational costs and token consumption for LLM-driven architecture search
- First-epoch accuracy accurately predicts final rankings (Spearman ρ = 0.926), validating single-epoch evaluation as a reliable proxy for full training cycles (see the correlation sketch after this list)
- Method works consistently across three 7B-parameter models and six computer vision datasets, indicating domain-agnostic applicability
- Combines LoRA fine-tuning with structural diversity filtering to improve training data quality and model generalization (see the LoRA sketch after this list)
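The single-epoch ranking check can be expressed as an ordinary Spearman rank correlation; the sketch below uses SciPy with made-up accuracy values purely as placeholders, not results from the paper:

```python
# Ranking-consistency check: does accuracy after one epoch preserve the
# ordering of candidate architectures after full training?
from scipy.stats import spearmanr

first_epoch_acc = [0.42, 0.55, 0.61, 0.48, 0.66]   # proxy scores after 1 epoch (placeholders)
final_acc       = [0.71, 0.79, 0.83, 0.74, 0.86]   # scores after full training (placeholders)

rho, p_value = spearmanr(first_epoch_acc, final_acc)
print(f"Spearman rho = {rho:.3f} (p = {p_value:.3g})")
# A rho near 1 means the single-epoch ranking matches the final ranking,
# so candidates can be compared without paying for full training runs.
```

For the LoRA fine-tuning step, a minimal setup with Hugging Face `peft` might look like the following; the adapter rank, target modules, and checkpoint id are illustrative assumptions rather than the paper's reported configuration:

```python
# LoRA fine-tuning sketch: wrap a 7B code model with low-rank adapters so
# only a small fraction of parameters is trained on the diff-generation data.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "deepseek-ai/deepseek-coder-7b-instruct-v1.5"  # one plausible checkpoint
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension (assumed)
    lora_alpha=32,                        # scaling factor for adapter updates (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections only (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # adapters leave the base weights frozen
```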