AI · Bullish · arXiv – CS AI · Apr 15 · 7/10
🧠AutoSurrogate is an LLM-driven framework that automates the construction of deep learning surrogate models for subsurface flow simulation, enabling domain scientists without machine learning expertise to build high-quality models through natural language instructions. The system autonomously handles data profiling, architecture selection, hyperparameter optimization, and quality assessment while managing failure modes, and it outperforms expert-designed baselines on geological carbon storage tasks.
AI · Neutral · arXiv – CS AI · Apr 14 · 7/10
🧠Researchers demonstrate that integrating fairness metrics directly into AutoML optimization improves algorithmic fairness by 14.5% while reducing data usage by 35.7%, though at the cost of a 9.4% decrease in predictive accuracy. This study challenges the industry standard of prioritizing performance over fairness and shows that simpler, fairer ML models can achieve a practical balance without requiring complex architectures.
🏢 Meta
AI · Neutral · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers present N2M-RSI, a formal model showing that AI systems feeding their own outputs back as inputs can experience unbounded complexity growth once they cross an information-integration threshold. The framework applies to both individual AI agents and swarms of communicating agents, with implementation details withheld for safety reasons.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠EARLY (Evolutionary Algorithm for Reservoir Learning and Yielding) introduces an automated method for optimizing Echo State Networks by evolving both topology and hyperparameters using evolutionary algorithms. The framework demonstrates that evolved architectures outperform random search baselines and adapt their complexity based on task difficulty, suggesting potential for creating reusable neural network structures across diverse temporal learning problems.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers introduce RAISE, a comprehensive framework for optimizing retrieval-augmented generation (RAG) systems by treating architecture design as a hyperparameter search problem. The study evaluates 13 optimization algorithms across seven datasets, revealing that RAG performance is highly task-dependent and no single optimization strategy universally outperforms others, highlighting the need for systematic rather than heuristic-based configuration approaches.
🏢 Meta
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers have released LLMSYS-HPOBench, the first comprehensive benchmark suite for hyperparameter optimization in real-world LLM systems, containing 364,450 configurations across 932 settings with multiple fidelity factors and cost metrics. The dataset addresses gaps in existing AutoML benchmarks by capturing the unprecedented complexity of optimizing both AI and non-AI components in production language model systems.
AI · Neutral · arXiv – CS AI · May 1 · 6/10
🧠Researchers introduce FairMind, an automated tool that detects fairness bias in machine learning datasets using causal analysis and LLM-generated reports. The software applies the standard fairness model to evaluate how protected variables influence predictions through counterfactual reasoning, addressing a critical gap in existing AutoML frameworks that typically ignore fairness considerations.
AI · Bullish · arXiv – CS AI · Apr 10 · 6/10
🧠Researchers propose PS-PFN, an advanced AutoML method that extends traditional algorithm selection and hyperparameter optimization to handle modern ML pipelines with fine-tuning and ensembling. Using posterior sampling and prior-data fitted networks for in-context learning, the approach outperforms existing bandit and AutoML strategies on benchmark tasks.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
🧠Researchers propose an Evaluation Agent framework to assess AI agent decision-making in AutoML pipelines, moving beyond outcome-focused metrics to evaluate intermediate decisions. The system can detect faulty decisions with a 91.9% F1 score and reveals impacts ranging from -4.9% to +8.3% in final performance metrics.