y0news
#constraints: 1 article
AI · Neutral · arXiv – CS AI · 10h ago · 7/10

CCTU: A Benchmark for Tool Use under Complex Constraints

Researchers introduce CCTU, a new benchmark for evaluating how well large language models use tools under complex constraints. The study finds that even state-of-the-art LLMs achieve task completion rates below 20% when strict constraint adherence is required, and that models violate constraints in over 50% of cases.