y0news

#planning-capabilities News & Analysis

1 article tagged with #planning-capabilities. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 7h ago · 6/10

LLM-WikiRace Benchmark: How Far Can LLMs Plan over Real-World Knowledge Graphs?

Researchers introduce LLM-WikiRace, a benchmark that tests the planning and reasoning abilities of large language models by requiring them to navigate Wikipedia links from a source page to a target page. Frontier models such as Gemini-3 achieve superhuman performance on easy tasks, but success rates fall to 23% at the hard difficulty level, revealing significant limitations in long-horizon planning and in recovering from failures.
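
To make the task concrete, here is a minimal sketch of the underlying navigation problem as a graph search. It is not the benchmark's evaluation code: the link graph `TOY_LINKS` and the helper `shortest_link_path` are hypothetical names invented for illustration, and the pages are made-up stand-ins rather than real Wikipedia data. Breadth-first search gives the shortest hop count, which is the kind of reference path a model's step-by-step link choices could be compared against.

```python
from collections import deque

# Hypothetical toy link graph (an assumption, not real Wikipedia data):
# each page maps to the pages it links to.
TOY_LINKS = {
    "Coffee": ["Caffeine", "Brazil"],
    "Caffeine": ["Chemistry"],
    "Brazil": ["South America", "Football"],
    "Chemistry": ["Physics"],
    "South America": ["Andes"],
    "Football": [],
    "Physics": [],
    "Andes": [],
}


def shortest_link_path(links, source, target):
    """Breadth-first search: return the shortest page sequence, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        page = path[-1]
        if page == target:
            return path
        for next_page in links.get(page, []):
            if next_page not in visited:
                visited.add(next_page)
                queue.append(path + [next_page])
    return None  # target is unreachable from source


path = shortest_link_path(TOY_LINKS, "Coffee", "Andes")
print(path, "hops:", len(path) - 1)
```

A search like this sees the whole graph at once. A model playing the game sees only the links on the current page, which is why long routes and recovery from a wrong turn are the hard part.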

🧠 GPT-5 · 🧠 Claude · 🧠 Opus