AI · Neutral · arXiv - CS AI · 10h ago · 6/10
CONDESION-BENCH: Conditional Decision-Making of Large Language Models in Compositional Action Space
Researchers introduce CONDESION-BENCH, a new benchmark for evaluating how large language models make decisions in complex, real-world scenarios with compositional actions and conditional constraints. The benchmark addresses limitations in existing decision-making frameworks by incorporating variable-level, contextual, and allocation-level restrictions that better reflect actual decision-making environments.