AI · Neutral · arXiv – CS AI · 7h ago · 6/10
PlanarBench: Evaluating LLM Spatial Reasoning via Planar Graph Drawing
Researchers introduce PlanarBench, a benchmark that evaluates the spatial reasoning of large language models by testing whether they can draw planar graphs as ASCII art from edge lists. Tests of 91 models on 199 non-isomorphic connected planar graphs show that edge count, not node count, is the dominant predictor of difficulty, which challenges assumptions in prior LLM graph benchmarking methodologies.
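
To make the two difficulty measures concrete, here is a minimal Python sketch that computes node count and edge count from an edge list and applies the standard necessary condition for planarity: a simple planar graph with n ≥ 3 nodes has at most 3n − 6 edges. This is not PlanarBench's code. The helper names `graph_stats` and `within_planar_edge_bound` and the sample graphs are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Edge = Tuple[int, int]


def graph_stats(edges: Iterable[Edge]) -> Tuple[int, int]:
    """Return (node_count, edge_count) for an undirected edge list.

    Self-loops and duplicate edges are dropped so the graph is simple.
    """
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    nodes = {v for e in unique for v in e}
    return len(nodes), len(unique)


def within_planar_edge_bound(edges: Iterable[Edge]) -> bool:
    """Necessary (not sufficient) planarity check: m <= 3n - 6 for n >= 3."""
    n, m = graph_stats(edges)
    if n < 3:
        return True  # graphs on fewer than 3 nodes are always planar
    return m <= 3 * n - 6


# Hypothetical sample inputs: complete graphs on 4 and 5 nodes,
# and the complete bipartite graph K3,3.
K4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
K5 = [(a, b) for a in range(5) for b in range(a + 1, 5)]
K33 = [(a, b) for a in range(3) for b in range(3, 6)]

if __name__ == "__main__":
    for name, g in [("K4", K4), ("K5", K5), ("K3,3", K33)]:
        print(name, graph_stats(g), within_planar_edge_bound(g))
```

The bound only rules graphs out: K3,3 passes it yet is not planar, so a full planarity test is still needed to confirm that a graph can be drawn without crossings.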