#operations-research News & Analysis

18 articles tagged with #operations-research. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Apr 20 · 7/10

Bilevel Optimization of Agent Skills via Monte Carlo Tree Search

Researchers propose a bilevel optimization framework using Monte Carlo Tree Search to systematically improve LLM agent skills—structured collections of instructions, tools, and resources. The framework optimizes both skill structure and component content simultaneously, demonstrating performance improvements on Operations Research tasks and addressing a previously unsolved challenge in agent design optimization.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers

A comprehensive tutorial examines how deep learning complements operations research and optimization for sequential decision-making under uncertainty. The framework positions AI not as a replacement for traditional optimization but as an enhancement, with applications across supply chains, healthcare, energy, and autonomous systems.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

UC-Search: Risk-Aware Test-Time Search for Delayed Constrained Time-Series Control

UC-Search is a model-agnostic test-time algorithm that combines time-series forecasting with constrained decision-making under uncertainty. The approach uses beam search and Monte Carlo tree search variants to optimize delayed control decisions while respecting feasibility constraints, demonstrating measurable improvements over existing methods like CEM and MPPI across inventory control and financial forecasting benchmarks.

AI · Neutral · arXiv – CS AI · Jun 23 · 5/10

Joint Air Traffic Flow and Capacity Management via Answer Set Programming

Researchers introduce a joint air traffic flow and capacity management model using Answer Set Programming that simultaneously optimizes aircraft trajectories and sector configurations. The ASP approach outperforms traditional Mixed Integer Programming methods and remains competitive with heuristics, demonstrating potential improvements in balancing flight demand with available airspace capacity.

AI · Bearish · arXiv – CS AI · Jun 19 · 6/10

ORAgentBench: Can LLM Agents Solve Challenging Operations Research Tasks End to End?

Researchers introduced ORAgentBench, a benchmark testing whether AI agents can autonomously solve complex operations research tasks end to end. Across 14 frontier agent-model configurations, the best agent solved only 35.51% of tasks and 20.59% of hard tasks. Failures stemmed from missed operational rules, weak solution construction, and insufficient optimization, indicating that AI agents remain far from ready for production OR work.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Bellman-Taylor Score Decoding for Markov Decision Processes with State-Dependent Feasible Action Sets

Researchers propose Bellman-Taylor score decoding, a novel deep reinforcement learning framework designed to handle Markov decision processes with state-dependent action constraints common in operations research. The method decouples policy learning into a Euclidean score space while maintaining feasibility through an action decoder, enabling standard DRL algorithms to optimize complex systems like queueing networks without architectural modifications.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Learning to replenish: A hybrid deep reinforcement learning for dynamic inventory management in the pharmaceutical supply chains

Researchers propose a hybrid deep reinforcement learning algorithm (A3C DPPO) to optimize inventory replenishment in pharmaceutical supply chains, addressing challenges of unpredictable demand, variable lead times, and product shelf-life constraints. The approach demonstrates cost reductions compared to benchmark methods while maintaining service levels, with validation using real-world pharmaceutical data.

AI · Bullish · arXiv – CS AI · Jun 4 · 6/10

Beyond Objective Equivalence: Constraint Injection for LLM-Based Optimization Modeling on Vehicle Routing Problems

Researchers propose constraint injection, a novel verification technique that detects missing or spurious constraints in LLM-generated optimization code. VRPCoder, an 8B model fine-tuned with this method, achieves 93% accuracy on vehicle routing problems, significantly outperforming GPT and Claude models on constraint-dense combinatorial optimization tasks.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Position Paper: Post-Solve Robustness in Decision Engines: Feasible Regions and Smoothness Under Perturbations

Researchers propose a post-solve robustness framework for Mixed-Integer Linear Programming decision engines, addressing the gap between theoretical optimal solutions and real-world deployment where parameter perturbations can invalidate feasibility. The work calls for standardized auditing of solved problems to measure how solutions perform under small cost, demand, and resource variations.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Opt-Verifier: Unleashing the Power of LLMs for Optimization Modeling via Dual-Side Verification

Researchers introduce Opt-Verifier, an LLM-based framework that improves automated mathematical optimization modeling by verifying generated models from both structural and solution perspectives. The dual-side verification approach addresses a critical gap in existing systems by checking constraints, variables, and the validity of solutions, achieving accuracy improvements of over 20% on benchmark tests.

AI · Bullish · arXiv – CS AI · May 29 · 6/10

OptSkills: Learning Generalizable Optimization Skills from Problem Archetypes via Cluster-Based Distillation

OptSkills, a new AI system, advances automated optimization problem-solving by clustering problems by underlying mathematical archetypes rather than surface narratives, achieving 68.27% accuracy on diverse benchmarks and outperforming DeepSeek-V3.2-Thinking on large-scale problems. The system uses skill distillation and trajectory learning to improve generalization across both known and novel problem types.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

OR-Space: A Full-Lifecycle Workspace Benchmark for Industrial Optimization Agents

Researchers introduce OR-Space, a comprehensive benchmark for evaluating large language model agents in industrial operations research workflows. Unlike existing benchmarks that focus on single-stage problem translation, OR-Space tests agents across persistent multi-artifact workspaces with three task modes—building optimization models, revising them under changing requirements, and explaining solutions—to assess real-world reliability and practical readiness.

AI · Neutral · arXiv – CS AI · May 28 · 5/10

An Enhanced Large Neighborhood Search Approach for the Capacitated Facility Location Problem with Incompatible Customers

Researchers have developed an enhanced Large Neighborhood Search (LNS) algorithm to solve a variant of the capacitated facility location problem that incorporates customer incompatibilities, where certain customer pairs cannot share the same facility. The new method employs hybrid destroy operators and exact solvers, achieving superior performance over existing metaheuristics on all benchmark instances.

AI · Bullish · arXiv – CS AI · May 28 · 6/10

Democratizing Large-Scale Re-Optimization with LLM-Guided Model Patches

Researchers present an LLM-powered framework that enables non-expert end users to re-optimize deployed decision-support systems through natural language interaction, eliminating dependency on operations research specialists. The system combines language models with an optimization toolbox to dynamically adapt models to changing business conditions while maintaining solution quality and interpretability.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

FrontierOR: Benchmarking LLMs' Capacity for Efficient Algorithm Design in Large-Scale Optimization

Researchers introduced FrontierOR, a benchmark that tests whether leading LLMs can design efficient optimization algorithms for real-world large-scale problems. The evaluation of seven models reveals significant limitations: even frontier models outperform Gurobi (a standard solver) in only 31% of cases, highlighting a substantial gap between LLM capabilities in formulation and practical algorithmic optimization.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

ORLoopBench: Solver-in-the-Loop Benchmarks for Self-Correction and Behavioral Rationality in Operations Research

Researchers introduce ORLoopBench, a benchmark suite that evaluates large language models on Operations Research tasks through an iterative solver-in-the-loop process rather than one-shot code generation. The framework enables models to debug infeasible mathematical models by inspecting constraint conflicts and repairing formulations, with an 8B model achieving 95.3% success on LP repair tasks—outperforming frontier APIs at 92.4%.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3

ViTSP: A Vision Language Models Guided Framework for Solving Large-Scale Traveling Salesman Problems

Researchers have developed ViTSP, a framework that uses pre-trained vision language models to solve large-scale Traveling Salesman Problems with average optimality gaps of just 0.24%. The system outperforms existing learning-based methods and reduces gaps by 3.57% to 100% compared to the best heuristic solver LKH-3 on instances with over 10,000 nodes.

AI · Bearish · arXiv – CS AI · Feb 27 · 6/10 · 6

ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization

Researchers introduced ConstraintBench, a new benchmark testing whether large language models can directly solve constrained optimization problems without external solvers. The study found that even the best frontier models achieve only 65% constraint satisfaction, with feasibility proving a bigger challenge than optimality.