AI · Bearish · arXiv – CS AI · Feb 27
ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization
Researchers introduced ConstraintBench, a new benchmark that tests whether large language models can solve constrained optimization problems directly, without external solvers. The study found that even the best frontier models achieve only 65% constraint satisfaction, and that feasibility is a bigger challenge than optimality.
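To make the distinction between feasibility and optimality concrete, here is a minimal sketch of how a proposed answer to a constrained optimization problem might be checked. The toy knapsack instance, the function names (`is_feasible`, `objective`, `best_value`, `score`), and the gap formula are all illustrative assumptions; they are not taken from ConstraintBench and do not represent its actual scoring method.

```python
from itertools import product

# Hypothetical toy instance (not from ConstraintBench):
# choose items to maximize total value without exceeding a weight limit.
VALUES = [10, 13, 7, 8]
WEIGHTS = [5, 6, 3, 4]
CAPACITY = 10


def is_feasible(choice):
    """Return True if the 0/1 selection is well-formed and within the weight limit."""
    if len(choice) != len(WEIGHTS) or any(c not in (0, 1) for c in choice):
        return False
    return sum(w * c for w, c in zip(WEIGHTS, choice)) <= CAPACITY


def objective(choice):
    """Total value of the selected items."""
    return sum(v * c for v, c in zip(VALUES, choice))


def best_value():
    """Brute-force the optimum; only practical for a tiny instance like this one."""
    candidates = product((0, 1), repeat=len(VALUES))
    return max(objective(c) for c in candidates if is_feasible(c))


def score(choice):
    """Return (feasible, gap): gap is the relative shortfall from the optimum,
    or None when the answer violates a constraint."""
    if not is_feasible(choice):
        return False, None
    opt = best_value()
    return True, (opt - objective(choice)) / opt
```

In this sketch, an answer such as `(1, 1, 0, 0)` has a high objective value but exceeds the weight limit, so it fails before optimality is even considered. An answer such as `(0, 1, 1, 0)` is feasible but slightly suboptimal. Checking feasibility first mirrors the summary's point that satisfying the constraints is the harder hurdle.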