AI · Bullish · arXiv – CS AI · 10h ago · 7/10
The Metanym Game: A Self-Contained, Self-Consistent LLM Peer-Community Benchmark for Structural Intelligence
Researchers introduce the Metanym Game, an LLM benchmark that measures structural intelligence through competitive word games in which the models themselves generate and evaluate the content, so no pre-existing test set is required. The benchmark applies spectral analysis to evaluator ratings and is contamination-resistant. Its results show that generation and judging skills dissociate significantly across models, and a self-governing council structure lets the competition scale dynamically.
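
The summary does not say which spectral method the paper uses. As a rough illustration only, one common form of spectral analysis on a ratings matrix is to take its leading singular vectors: one vector gives a consensus score per generator, the other a weight per evaluator. The sketch below does this with plain power iteration. The matrix layout, the function name `leading_singular_vectors`, and the sample ratings are all hypothetical and are not taken from the paper.

```python
import math


def leading_singular_vectors(ratings, iterations=200):
    """Power iteration for the leading singular pair of a ratings matrix.

    Assumed (hypothetical) layout: ratings[i][j] is the score that
    evaluator i gave to generator j. Ratings are assumed non-negative
    and not all zero. Returns (evaluator_weights, generator_scores),
    each normalised to unit length.
    """
    n_eval, n_gen = len(ratings), len(ratings[0])
    # Start from a uniform positive vector so the iterates stay non-negative.
    v = [1.0 / math.sqrt(n_gen)] * n_gen
    u = [0.0] * n_eval
    for _ in range(iterations):
        # u <- R v, then normalise: how strongly each evaluator aligns with consensus.
        u = [sum(ratings[i][j] * v[j] for j in range(n_gen)) for i in range(n_eval)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # v <- R^T u, then normalise: evaluator-weighted consensus score per generator.
        v = [sum(ratings[i][j] * u[i] for i in range(n_eval)) for j in range(n_gen)]
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
    return u, v


# Made-up ratings: three evaluators (rows) scoring three generators (columns).
ratings = [
    [9, 5, 2],
    [8, 6, 1],
    [9, 4, 3],
]
evaluator_weights, generator_scores = leading_singular_vectors(ratings)
# Rank generators by consensus score, best first.
ranking = sorted(range(len(generator_scores)), key=lambda j: -generator_scores[j])
print(ranking)
```

In this toy setup a model appears once as a column (its generated content being rated) and once as a row (its ratings of others), so its generation score and its evaluator weight are separate numbers. That separation is what would let a benchmark of this kind observe generation and judging skill dissociating.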