y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#agent-planning News & Analysis

1 article tagged with #agent-planning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 articles
AINeutralarXiv – CS AI · 7h ago6/10
🧠

CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing

Researchers introduce CreativityBench, a benchmark with 4K entities and 150K+ affordance annotations to evaluate how well large language models can creatively repurpose tools by reasoning about their properties rather than canonical uses. Evaluations across 10 state-of-the-art LLMs reveal significant limitations: models struggle to identify correct parts, affordances, and physical mechanisms needed for non-obvious solutions, with performance gains from scaling and reasoning strategies like Chain-of-Thought proving limited.