AI · Bullish · arXiv – CS AI · Apr 20 · 7/10
🧠Researchers propose a bilevel optimization framework using Monte Carlo Tree Search to systematically improve LLM agent skills—structured collections of instructions, tools, and resources. The framework optimizes both skill structure and component content simultaneously, demonstrating performance improvements on Operations Research tasks and addressing a previously unsolved challenge in agent design optimization.
AI · Bullish · arXiv – CS AI · Apr 14 · 7/10
🧠A comprehensive tutorial examines how deep learning complements operations research and optimization for sequential decision-making under uncertainty. The framework positions AI not as a replacement for traditional optimization but as an enhancement, with applications across supply chains, healthcare, energy, and autonomous systems.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose a hybrid deep reinforcement learning algorithm (A3C DPPO) to optimize inventory replenishment in pharmaceutical supply chains, addressing challenges of unpredictable demand, variable lead times, and product shelf-life constraints. The approach demonstrates cost reductions compared to benchmark methods while maintaining service levels, with validation using real-world pharmaceutical data.
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose constraint injection, a novel verification technique that detects missing or spurious constraints in LLM-generated optimization code. VRPCoder, an 8B model fine-tuned with this method, achieves 93% accuracy on vehicle routing problems, significantly outperforming GPT and Claude models on constraint-dense combinatorial optimization tasks.
🧠 Claude · 🧠 Gemini
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers propose a post-solve robustness framework for Mixed-Integer Linear Programming decision engines, addressing the gap between theoretical optimal solutions and real-world deployment where parameter perturbations can invalidate feasibility. The work calls for standardized auditing of solved problems to measure how solutions perform under small cost, demand, and resource variations.
AI · Neutral · arXiv – CS AI · May 29 · 6/10
🧠Researchers introduce Opt-Verifier, an LLM-based framework that improves automated mathematical optimization modeling by verifying generated models from both structural and solution perspectives. The dual-side verification approach addresses a critical gap in existing systems by validating constraints, variables, and solution validity, achieving over 20% accuracy improvements on benchmark tests.
AI · Bullish · arXiv – CS AI · May 29 · 6/10
🧠OptSkills, a new AI system, advances automated optimization problem-solving by clustering problems by underlying mathematical archetypes rather than surface narratives, achieving 68.27% accuracy on diverse benchmarks and outperforming DeepSeek-V3.2-Thinking on large-scale problems. The system uses skill distillation and trajectory learning to improve generalization across both known and novel problem types.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers introduce OR-Space, a comprehensive benchmark for evaluating large language model agents in industrial operations research workflows. Unlike existing benchmarks that focus on single-stage problem translation, OR-Space tests agents across persistent multi-artifact workspaces with three task modes—building optimization models, revising them under changing requirements, and explaining solutions—to assess real-world reliability and practical readiness.
AI · Neutral · arXiv – CS AI · May 28 · 5/10
🧠Researchers have developed an enhanced Large Neighborhood Search (LNS) algorithm to solve a variant of the capacitated facility location problem that incorporates customer incompatibilities, where certain customer pairs cannot share the same facility. The new method employs hybrid destroy operators and exact solvers, achieving superior performance over existing metaheuristics on all benchmark instances.
AI · Bullish · arXiv – CS AI · May 28 · 6/10
🧠Researchers present an LLM-powered framework that enables non-expert end users to re-optimize deployed decision-support systems through natural language interaction, eliminating dependency on operations research specialists. The system combines language models with an optimization toolbox to dynamically adapt models to changing business conditions while maintaining solution quality and interpretability.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Researchers introduced FrontierOR, a benchmark that tests whether leading LLMs can design efficient optimization algorithms for real-world large-scale problems. The evaluation of seven models reveals significant limitations: even frontier models outperform Gurobi (a standard solver) in only 31% of cases, highlighting a substantial gap between LLM capabilities in formulation and practical algorithmic optimization.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Researchers introduce ORLoopBench, a benchmark suite that evaluates large language models on Operations Research tasks through an iterative solver-in-the-loop process rather than one-shot code generation. The framework enables models to debug infeasible mathematical models by inspecting constraint conflicts and repairing formulations, with an 8B model achieving 95.3% success on LP repair tasks—outperforming frontier APIs at 92.4%.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠Researchers have developed ViTSP, a framework that uses pre-trained vision language models to solve large-scale Traveling Salesman Problems with average optimality gaps of just 0.24%. The system outperforms existing learning-based methods and reduces gaps by 3.57% to 100% compared to the best heuristic solver LKH-3 on instances with over 10,000 nodes.
AI · Bearish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers introduced ConstraintBench, a new benchmark testing whether large language models can directly solve constrained optimization problems without external solvers. The study found that even the best frontier models only achieve 65% constraint satisfaction, with feasibility being a bigger challenge than optimality.