#optimization-algorithms News & Analysis

30 articles tagged with #optimization-algorithms. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

Delay-Adaptive Speculation Control for Low-Latency Edge-Cloud LLM Inference

Researchers develop a delay-adaptive algorithm for optimizing speculative decoding in distributed LLM inference across edge-cloud systems. The study proves optimal draft length follows a finite threshold policy and introduces UCB-SpecStop, an online control algorithm that reduces per-token latency by up to 22.4% compared to existing methods while adapting to varying network conditions.

Llama

AI · Bullish · arXiv – CS AI · Jun 4 · 7/10

Model-Preserving Adaptive Rounding

Researchers introduce YAQA, a new quantization algorithm that improves model compression by directly optimizing end-to-end error rather than layer-by-layer error. The method achieves 30% error reduction compared to existing approaches like GPTQ and even outperforms quantization-aware training, with theoretical guarantees backing its performance.

AI · Bullish · arXiv – CS AI · Apr 20 · 7/10

StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models

Researchers introduce StoSignSGD, a novel optimization algorithm that fixes convergence issues in SignSGD by injecting structural stochasticity while maintaining unbiased updates. The algorithm demonstrates 1.44x to 2.14x speedup in low-precision FP8 LLM pretraining where AdamW fails, and outperforms existing optimizers in mathematical reasoning fine-tuning tasks.

AI · Bullish · arXiv – CS AI · Apr 10 · 7/10

Space Filling Curves is All You Need: Communication-Avoiding Matrix Multiplication Made Simple

Researchers present a new approach to General Matrix Multiplication (GEMM) using Space Filling Curves that automatically optimizes data movement across memory hierarchies without requiring platform-specific tuning. The method achieves up to 5.5x speedups over vendor libraries and demonstrates significant performance gains in LLM inference and distributed computing applications.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

ASAP: Agent-System Co-Design for Wall-Clock-Centered Auto HPO Research for ML Experiments

Researchers introduce ASAP, an agent-system co-design that leverages LLMs to coordinate multiple hyperparameter optimization tools while reducing wall-clock execution time through architectural innovations like KV-cache reuse and speculation parallelism. The approach addresses fundamental limitations in current LLM-based HPO methods by treating the language model as an orchestrator rather than a replacement tool, demonstrating consistent performance improvements across diverse ML tasks.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Distribution Preference Optimization: A Fine-grained Perspective for LLM Unlearning

Researchers introduce DiPO (Distribution Preference Optimization), a novel algorithm for LLM unlearning that operates at the token distribution level rather than full response level. The method addresses limitations in existing approaches like NPO by constructing preference signals through selective amplification of model logits, achieving superior performance on benchmark tests while maintaining model utility.

AI · Neutral · arXiv – CS AI · Jun 23 · 5/10

Constituency Optimisation Through Hamiltonian Representation Of Mandates (COTHROM): Algorithmic Redistricting of Irish Election Boundaries

Researchers have developed COTHROM, the first computational framework for optimizing Irish electoral redistricting using statistical physics and machine learning algorithms. The system balances multiple constitutional objectives—such as proportional representation and geographic compactness—by treating them as variables in a Hamiltonian function, demonstrating improvements over existing legal boundaries in County Cork.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Generative Robust Optimisation

Researchers introduce Generative Robust Optimisation (GRO), a framework using deep generative models to define uncertainty sets for optimization problems that better capture real-world data complexity than traditional geometric approaches. The method combines neural network decoders with a five-point evaluation framework and demonstrates practical applicability through production planning and facility location studies.

AI · Bullish · arXiv – CS AI · Jun 19 · 6/10

Oranits: Mission Assignment and Task Offloading in Open RAN-based ITS using Metaheuristic and Deep Reinforcement Learning

Researchers introduce Oranits, a system for optimizing mission assignment and task offloading in Open RAN-based autonomous vehicle networks using metaheuristic algorithms and deep reinforcement learning. The proposed MA-DDQN framework achieves 11% improvement in mission completions and 12.5% improvement in overall benefit compared to baseline methods, advancing edge computing efficiency in intelligent transportation systems.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

The Whale That Outswam Evolution: Swarm Intelligence Maximises Memory in Connectome Reservoirs

Researchers applied four bio-inspired optimization algorithms to connectome-based neural networks across six animal species, demonstrating that gradient-free optimization can enhance biological neural structures by up to 17x on memory capacity tasks. The findings show that biological weight values, refined through evolution, serve as critical initial conditions that topology alone cannot replicate, establishing a principled approach for improving connectome-based reservoir computing systems.

AI · Bullish · arXiv – CS AI · Jun 10 · 6/10

Importance-Aware Scheduling for High-Dimensional Hyperparameter Optimization

Researchers propose Greedy Importance First (GIF), a novel hyperparameter optimization strategy that uses importance-based scheduling to improve efficiency in high-dimensional ML/DL model training. The method outperforms established optimizers like TPE and BOHB on high-dimensional benchmarks by focusing computational resources on the most impactful hyperparameters.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

FOGO: Forgetting-aware Orthogonalization Optimizer

Researchers introduce FOGO, a new optimizer that addresses gradient interference during neural network training by orthogonalizing momentum updates and storing past directions in compressed memory. The method shows improvements over Adam and Muon across diverse tasks including continual learning, class-imbalanced classification, and large language model training.

AI · Bullish · arXiv – CS AI · Jun 10 · 6/10

Unifying Local Communications and Local Updates for LLM Pretraining

Researchers introduce GASLoC, a decentralized pre-training algorithm that reduces communication overhead in distributed LLM training by enabling local optimizer steps and sparse peer communication instead of synchronous operations. The method demonstrates competitive or superior performance compared to existing approaches, particularly in heterogeneous bandwidth environments where worker speeds vary significantly.

AI · Bullish · arXiv – CS AI · Jun 9 · 6/10

IDEQ -- Improving Diffusion Models for the Traveling Salesman Problem (TSP) by Leveraging the Structure of the Solution Space

Researchers introduce IDEQ, an improved diffusion model approach for solving the Traveling Salesman Problem that achieves state-of-the-art results for neural network-based methods, matching or exceeding traditional heuristics like LKH3 on benchmark instances while maintaining better scalability.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

Coordinated optimization of departure sequencing and section-track allocation in railway short-term concentrated departure scenarios based on qubo and hybrid quantum algorithms

Researchers developed a QUBO-based optimization framework combined with hybrid quantum algorithms to improve railway departure scheduling during peak periods. Testing shows quantum-enhanced methods reduced operational costs by 4-26% and delays by 4-24% compared to conventional approaches, though real-world validation remains pending.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Multi-ResNets for Subspace Preconditioning in Constrained Optimization

Researchers propose MResOpt, a staged residual neural network architecture that solves constrained optimization problems by decomposing constraint satisfaction hierarchically. The method demonstrates improved performance on convex and non-convex optimization benchmarks, with particular applications to power flow problems in electrical grids.

AI · Neutral · arXiv – CS AI · Jun 4 · 5/10

Constraint-Enhanced Physical Search through Correlation Matching

Researchers propose a constraint-enhanced physical search principle demonstrating that exploration efficiency improves by matching temporal correlations in exploration patterns to spatial correlations generated by physical constraints, rather than maximizing randomness or anti-correlation.

AI · Neutral · arXiv – CS AI · Jun 4 · 5/10

Multi-Column RBF Neural Network Using Adaptive and Non-Adaptive Particle Swarm Optimization

Researchers propose MC-PSO and MC-APSO, novel parallel neural network architectures that combine multi-column radial basis function networks with particle swarm optimization algorithms. These methods outperform existing approaches in accuracy, recall, and computational efficiency on benchmark datasets by distributing training across spatial subsets.

AI · Bullish · arXiv – CS AI · Jun 2 · 6/10

Application of Algorithms in Energy-Efficient Design Platforms for Green Building

Researchers developed an integrated algorithmic platform combining Building Information Modeling, sensor data, and multi-objective optimization to design energy-efficient buildings. Testing on a mid-rise office building achieved a 29.3% reduction in annual energy consumption while limiting lifecycle cost increases to 3.7%, demonstrating practical scalability for green building design.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Stochastic convergence of parallel asynchronous adaptive first-order methods

Researchers introduce a new class of asynchronous adaptive first-order optimization methods that improve upon existing algorithms through momentum and inexact normalization variants. The methods achieve O(1/√t) convergence rates in stochastic non-convex settings and demonstrate practical relevance for large-scale heterogeneous machine learning systems.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

FOAM: Frequency and Operator Error-Based Adaptive Damping Method for Reducing Staleness-Oriented Error for Shampoo

Researchers propose FOAM, an adaptive algorithm that addresses the computational bottleneck in Shampoo optimization by dynamically controlling damping factors and eigendecomposition frequency to mitigate errors from stale preconditioner updates. The method reduces wall-clock training time while maintaining convergence stability, offering a practical solution to the efficiency-fidelity trade-off in large-scale machine learning optimization.

AI · Neutral · arXiv – CS AI · May 29 · 5/10

Selection Hyper-heuristics Can Automatically Adjust the Learning Period to Optimally Solve Pseudo-Boolean Problems

Researchers demonstrate how selection hyper-heuristics can automatically adjust learning periods to optimize pseudo-Boolean problem solving, eliminating manual parameter tuning. The Random Gradient hyper-heuristic achieves optimal neighbourhood size selection in nearly all iterations while maintaining theoretically optimal performance on the LeadingOnes benchmark.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Turning Stale Gradients into Stable Gradients: Coherent Coordinate Descent with Implicit Landscape Smoothing for Lightweight Zeroth-Order Optimization

Researchers propose Coherent Coordinate Descent (CoCD), a deterministic zeroth-order optimization method that improves sample efficiency for scenarios where backpropagation is unavailable. The approach reframes stale gradients as computational assets and demonstrates that larger finite-difference step sizes create implicit landscape smoothing, achieving superior convergence stability compared to existing randomized methods across neural network architectures.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Theoretical Analysis of Sparse Optimization with Reparameterization, Weight Decay, and Adaptive Learning Rate

Researchers introduce ReWA, a novel sparse optimization method combining reparameterization, weight decay, and adaptive learning rates to address instability issues in ℓp regularization. Experiments on CIFAR-10 and ImageNet demonstrate that ReWA achieves superior sparsity compared to ℓ1 regularization while maintaining test accuracy, offering a practical alternative for neural network compression.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

HEART: Achieving Timely Multi-Model Training for Vehicle-Edge-Cloud-Integrated Hierarchical Federated Learning

Researchers introduce HEART, a novel framework for efficient multi-model federated learning across vehicle-edge-cloud architectures that addresses training latency and resource allocation challenges in IoV systems. The solution combines hybrid synchronous-asynchronous aggregation with optimized task scheduling using particle swarm optimization and genetic algorithms.
