AI · Bullish · arXiv – CS AI · Apr 15 · 7/10
🧠AutoSurrogate is an LLM-driven framework that automates the construction of deep learning surrogate models for subsurface flow simulation, letting domain scientists without machine learning expertise build high-quality models from natural language instructions. The system autonomously handles data profiling, architecture selection, hyperparameter optimization, and quality assessment, manages failure modes along the way, and outperforms expert-designed baselines on geological carbon storage tasks.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠Researchers propose a new evaluation framework for certified neural network training methods using Pareto front comparisons to assess the natural-certified accuracy trade-off. By applying automated hyperparameter optimization across methods, they reveal significant undertuning in prior work and establish new performance benchmarks that challenge assumptions about state-of-the-art certified robustness.
🏢 Meta
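The Pareto front comparison mentioned above can be illustrated with a minimal sketch. It assumes each tuned run is summarized as a (natural accuracy, certified accuracy) pair where higher is better on both axes; the function name `pareto_front` and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# (natural accuracy, certified accuracy); higher is better on both axes
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates."""
    front = []
    for p in points:
        # q dominates p if it is at least as good on both axes
        # and strictly better on at least one of them.
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    # Deduplicate and sort by natural accuracy for readability.
    return sorted(set(front))


# Hypothetical results from several hyperparameter configurations
# (illustrative values only, not reported numbers).
runs = [(0.90, 0.40), (0.85, 0.55), (0.80, 0.50), (0.70, 0.60), (0.88, 0.38)]
front = pareto_front(runs)
```

Under this view, one training method would be preferred over another only if its front lies above or to the right of the other's, rather than on the basis of a single tuned configuration.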
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠Researchers propose Self-Adaptive Monotonic Normalization (SAMN), a hyperparameter-friendly approach to improve long-tailed recognition in deep learning. The method eliminates the need for manual parameter tuning while achieving state-of-the-art performance by enforcing monotonic constraints on per-class weight norms during classifier retraining.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠Researchers propose c-TPE, an enhanced Bayesian optimization method that extends the Tree-structured Parzen Estimator to handle inequality constraints in hyperparameter optimization. The method addresses practical real-world limitations like memory and latency constraints while maintaining strong performance, demonstrating superiority over existing approaches across 81 expensive optimization problems.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers demonstrate that weight decay during language model pretraining significantly improves model plasticity—the ability to adapt to downstream tasks through fine-tuning. The study reveals counterintuitive findings where higher weight decay produces weaker base models but stronger performance after task-specific training, challenging conventional approaches to hyperparameter optimization.
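For readers unfamiliar with the hyperparameter in question, the sketch below shows one common form of weight decay, the decoupled variant applied to a plain gradient step. This is an assumption made for illustration; the summary does not say which optimizer or decay formulation the study uses, and the function name `sgd_step_decoupled_wd` is hypothetical.

```python
from typing import List


def sgd_step_decoupled_wd(
    weights: List[float],
    grads: List[float],
    lr: float,
    weight_decay: float,
) -> List[float]:
    """One gradient step with decoupled weight decay.

    Each weight moves against its gradient and is also shrunk toward
    zero by lr * weight_decay * w, independently of the gradient.
    """
    return [w - lr * g - lr * weight_decay * w for w, g in zip(weights, grads)]


# With zero gradients, only the decay term acts on the weights.
w = [1.0, -2.0]
g = [0.0, 0.0]
w_next = sgd_step_decoupled_wd(w, g, lr=0.1, weight_decay=0.5)
```

A larger `weight_decay` pulls weights toward zero more strongly at every step, which is the knob the study varies during pretraining.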
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers introduce RAISE, a comprehensive framework for optimizing retrieval-augmented generation (RAG) systems by treating architecture design as a hyperparameter search problem. The study evaluates 13 optimization algorithms across seven datasets, revealing that RAG performance is highly task-dependent and no single optimization strategy universally outperforms others, highlighting the need for systematic rather than heuristic-based configuration approaches.
🏢 Meta
AI · Neutral · arXiv – CS AI · 6d ago · 5/10
🧠Researchers demonstrate that recombination-based operators in Cartesian Genetic Programming can achieve competitive performance when combined with proper hyperparameter optimization, challenging the long-held assumption that mutation-only approaches are superior for symbolic regression tasks.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers demonstrate that small-scale proxy models commonly used by AI companies to evaluate data curation strategies produce unreliable conclusions because optimal training configurations are data-dependent. They propose using reduced learning rates in proxy model training as a simple, cost-effective solution that better predicts full-scale model performance across diverse data recipes.
🏢 Meta
AI · Neutral · arXiv – CS AI · Apr 13 · 6/10
🧠Researchers systematically evaluated how sampling temperature and prompting strategies affect extended reasoning performance in large language models, finding that zero-shot prompting peaks at moderate temperatures (T=0.4-0.7) while chain-of-thought performs better at extremes. The study reveals that extended reasoning benefits grow substantially with higher temperatures, suggesting that T=0 is suboptimal for reasoning tasks.
🧠 Grok
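As background for the temperature settings discussed above, here is a minimal sketch of how sampling temperature is commonly applied: logits are divided by T before the softmax, and T=0 is treated as greedy decoding. The function name `softmax_with_temperature` and the example logits are hypothetical; the study's own evaluation setup is not described in the summary.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to token probabilities at a given temperature."""
    if temperature <= 0:
        # T=0 is conventionally treated as greedy decoding:
        # all probability mass goes to the highest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # illustrative values only
greedy = softmax_with_temperature(logits, 0.0)
moderate = softmax_with_temperature(logits, 0.4)
neutral = softmax_with_temperature(logits, 1.0)
```

Lower temperatures concentrate probability on the top token, while higher temperatures flatten the distribution and produce more varied samples.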
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠Researchers introduce Hyperparameter Trajectory Inference (HTI), a method to predict how neural networks behave with different hyperparameter settings without expensive retraining. The approach uses conditional Lagrangian optimal transport to create surrogate models that approximate neural network outputs across various hyperparameter configurations.
AI · Neutral · arXiv – CS AI · Mar 11 · 5/10
🧠Researchers introduce the Overfitting-Underfitting Indicator (OUI) to analyze learning rate sensitivity in PPO reinforcement learning systems. The metric can identify problematic learning rates early in training by measuring neural activation patterns, enabling more efficient hyperparameter screening without full training runs.
AI · Neutral · Hugging Face Blog · Nov 2 · 4/10 · 6
🧠The article discusses hyperparameter optimization techniques for transformer models using Ray Tune, a distributed hyperparameter tuning library. This approach enables efficient scaling of machine learning model training and optimization across multiple computing resources.