Adversarial Instance Generation and Robust Training for Neural Combinatorial Optimization with Multiple Objectives
Researchers propose a framework for improving the robustness of deep reinforcement learning solvers for multi-objective combinatorial optimization problems. It generates adversarial instances that expose solver weaknesses, then trains against them using hardness-aware preference selection. The authors report improved solver generalizability on traveling salesman, vehicle routing, and knapsack problems.
This research addresses a critical gap in the deployment of neural combinatorial optimization solvers: their vulnerability to adversarial perturbations and out-of-distribution problem instances. While deep reinforcement learning has shown promise for complex optimization tasks, practical applications require solvers that maintain performance across diverse scenarios and edge cases. The authors develop a two-pronged approach that both identifies failure modes and systematically hardens solvers against them.
The work builds on growing recognition that learning-based optimization requires robustness guarantees comparable to traditional algorithmic approaches. Neural solvers often overfit to training distributions, limiting their real-world applicability where problem characteristics vary unpredictably. Multi-objective scenarios introduce additional complexity, requiring solvers to navigate trade-offs across competing objectives while maintaining solution quality. The preference-conditioned approach allows flexible optimization across different Pareto-front regions, but this flexibility creates surface area for adversarial attacks.
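To make the idea of preference conditioning concrete, here is a minimal sketch of how a preference vector can turn several objectives into a single score. The summary does not say which scalarization the authors use, so the weighted sum below is an assumption (it is one common choice), and the names `weighted_sum`, `best_for_preference`, and the toy `candidates` are hypothetical.

```python
from typing import Sequence, Tuple

Costs = Tuple[float, float]


def weighted_sum(costs: Sequence[float], pref: Sequence[float]) -> float:
    """Collapse a cost vector (all objectives minimized) into one scalar.

    `pref` holds non-negative weights that sum to 1. Weighted-sum
    scalarization is an assumption here, not the authors' stated method.
    """
    return sum(w * c for w, c in zip(pref, costs))


def best_for_preference(candidates: Sequence[Costs], pref: Sequence[float]) -> Costs:
    """Pick the candidate solution with the lowest scalarized cost."""
    return min(candidates, key=lambda c: weighted_sum(c, pref))


# Toy two-objective cost vectors for three candidate solutions (hypothetical).
candidates = [(1.0, 9.0), (4.0, 4.0), (9.0, 1.0)]

# Different preferences select different regions of the Pareto front.
favor_first = best_for_preference(candidates, (0.9, 0.1))
balanced = best_for_preference(candidates, (0.5, 0.5))
favor_second = best_for_preference(candidates, (0.1, 0.9))
print(favor_first, balanced, favor_second)
```

Because one model has to serve every preference vector, a solver that performs well for some preferences can still fail badly for others, which is the exposure the paragraph above describes.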
The proposed adversarial attack generates hard instances that expose solver weaknesses, with the damage quantified through Pareto-front degradation metrics. The defense strategy integrates these insights into training through hardness-aware preference sampling, which is intended to prevent overfitting to restricted preference regions. Tests on three canonical problems, the multi-objective traveling salesman problem (MOTSP), the multi-objective capacitated vehicle routing problem (MOCVRP), and the multi-objective knapsack problem (MOKP), show consistent improvements in robustness and generalization.
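The two ingredients above can be sketched roughly as follows. The summary does not name the degradation metric or the sampling rule, so both are assumptions: the sketch uses the two-objective hypervolume indicator, a widely used Pareto-front quality measure, and a softmax over per-preference gaps as one plausible form of hardness-aware sampling. All function names, fronts, and gap values are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(front: Sequence[Point], ref: Point) -> float:
    """Area dominated by `front` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that strictly improve on the reference point.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, best_y = 0.0, ref[1]
    for x, y in pts:
        if y < best_y:  # dominated points add no area and are skipped
            hv += (ref[0] - x) * (best_y - y)
            best_y = y
    return hv


def hv_gap(solver_front: Sequence[Point], baseline_front: Sequence[Point], ref: Point) -> float:
    """Relative hypervolume lost by the solver versus a baseline front."""
    hv_base = hypervolume_2d(baseline_front, ref)
    if hv_base <= 0.0:
        raise ValueError("baseline front must dominate part of the reference box")
    return (hv_base - hypervolume_2d(solver_front, ref)) / hv_base


def hardness_probs(gaps: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over per-preference gaps: harder preferences get more weight."""
    top = max(gaps)  # subtract the max for numerical stability
    exps = [math.exp((g - top) / temperature) for g in gaps]
    total = sum(exps)
    return [e / total for e in exps]


def sample_preferences(prefs, gaps, k: int, seed: int = 0):
    """Draw k training preferences, biased toward those with larger gaps."""
    rng = random.Random(seed)
    return rng.choices(prefs, weights=hardness_probs(gaps), k=k)


# Hypothetical fronts on one adversarial instance, with reference point (4, 4).
baseline = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
solver = [(2.0, 3.0), (3.0, 2.0)]
ref = (4.0, 4.0)
gap = hv_gap(solver, baseline, ref)  # baseline HV 6.0, solver HV 3.0, gap 0.5

# Hypothetical per-preference gaps measured on hard instances.
prefs = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]
gaps = [0.1, 0.5, 0.2]
batch = sample_preferences(prefs, gaps, k=5)
```

A lower `temperature` concentrates sampling on the hardest preferences, while a higher one moves it toward uniform sampling. Some spread is usually worth keeping so that easier preference regions are not neglected.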
This research has implications for practitioners deploying neural optimization solvers in production environments. Improved robustness reduces the risk of catastrophic performance degradation on unexpected problem distributions, which could enable broader adoption. The framework also provides a systematic methodology for testing solver reliability before deployment. Future work should explore the computational cost of adversarial training and its scalability to larger problem instances, both critical factors for industrial adoption.
- Adversarial training significantly improves neural solver robustness across out-of-distribution multi-objective combinatorial problems.
- Hardness-aware preference selection during training prevents overfitting to restricted preference regions and improves generalization.
- The framework successfully identifies hard problem instances that expose solver vulnerabilities across multiple optimization problems.
- Neural combinatorial optimizers require systematic robustness evaluation before deployment in production environments.
- Multi-objective solvers benefit from diverse preference sampling during training to maintain Pareto-front quality across conditions.