Reasoning in a Combinatorial and Constrained World: Benchmarking LLMs on Natural-Language Combinatorial Optimization
Researchers introduced NLCO, a benchmark that evaluates large language models (LLMs) on combinatorial optimization problems stated in natural language, without external solvers or code generation. Tests across modern LLMs show that high-performing models handle small instances well, but performance degrades significantly as problem complexity increases; graph-structured and bottleneck-objective problems prove particularly challenging.
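For readers unfamiliar with the term, a bottleneck objective scores a solution by its single worst element (for example, the longest edge in a tour) rather than by the sum of all elements. The sketch below illustrates the difference on a toy 4-city tour problem. It is not taken from the benchmark: the distance matrix `DIST` and the helper names are hypothetical, chosen only so that the two objectives pick different tours.

```python
from itertools import permutations

# Hypothetical symmetric distance matrix for 4 cities (illustrative values only).
DIST = [
    [0, 1, 6, 9],
    [1, 0, 1, 5],
    [6, 1, 0, 1],
    [9, 5, 1, 0],
]


def tour_edges(tour):
    """Return the edge lengths of a closed tour that returns to its start."""
    return [DIST[a][b] for a, b in zip(tour, tour[1:] + tour[:1])]


def total_cost(tour):
    """Sum objective: total length of the tour."""
    return sum(tour_edges(tour))


def bottleneck_cost(tour):
    """Bottleneck objective: length of the longest single edge in the tour."""
    return max(tour_edges(tour))


# Enumerate every tour that starts at city 0 (feasible only for tiny instances).
tours = [(0,) + p for p in permutations(range(1, 4))]

best_sum = min(tours, key=total_cost)
best_bottleneck = min(tours, key=bottleneck_cost)

print("sum-optimal tour:", best_sum, "total =", total_cost(best_sum))
print("bottleneck-optimal tour:", best_bottleneck, "max edge =", bottleneck_cost(best_bottleneck))
```

In this toy instance the tour with the smallest total length contains the longest edge in the matrix, while the bottleneck-optimal tour accepts a slightly larger total to avoid that edge. Brute-force enumeration is used here only because the instance is tiny; it is not a claim about how the benchmark computes reference solutions.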