Semantic Voting: Execution-Grounded Consensus for LLM Code Generation
Researchers demonstrate that execution-based voting methods for LLM code generation outperform text-based majority voting by 18-52 percentage points. The study also finds that input quality, particularly sketch-based generation, matters far more than the aggregation algorithm itself, challenging assumptions about how to select the best code output.
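To make the contrast between the two voting styles concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the study's implementation: the function names (`text_majority_vote`, `behavior_signature`, `semantic_vote`), the toy candidates, and the probe inputs are all hypothetical. Text-based voting counts identical source strings, while execution-based voting runs each candidate on shared inputs and groups candidates whose outputs agree.

```python
from collections import Counter, defaultdict


def text_majority_vote(candidates):
    """Text-based baseline: pick the most frequent source string,
    comparing candidates after collapsing whitespace."""
    normalized = [" ".join(src.split()) for src in candidates]
    winner, _ = Counter(normalized).most_common(1)[0]
    return candidates[normalized.index(winner)]


def behavior_signature(source, func_name, probe_inputs):
    """Run one candidate on every probe input and return its outputs
    as a tuple. Returns None if the candidate cannot be loaded.

    Assumption: candidates are trusted toy snippets. A real system
    would run generated code in a sandbox with timeouts, not exec().
    """
    namespace = {}
    try:
        exec(source, namespace)
        func = namespace[func_name]
    except Exception:
        return None
    outputs = []
    for args in probe_inputs:
        try:
            outputs.append(repr(func(*args)))
        except Exception as exc:
            # A runtime failure is still observable behavior.
            outputs.append("error:" + type(exc).__name__)
    return tuple(outputs)


def semantic_vote(candidates, func_name, probe_inputs):
    """Execution-based voting: group candidates by identical behavior
    and return one member of the largest group."""
    clusters = defaultdict(list)
    for src in candidates:
        signature = behavior_signature(src, func_name, probe_inputs)
        if signature is not None:
            clusters[signature].append(src)
    if not clusters:
        return None
    return max(clusters.values(), key=len)[0]


# Hypothetical samples: three correct but textually different
# implementations, plus two identical copies of a buggy one.
CANDIDATES = [
    "def add(a, b):\n    return a + b\n",
    "def add(x, y):\n    return x + y\n",
    "def add(a, b):\n    return sum([a, b])\n",
    "def add(a, b):\n    return a - b\n",
    "def add(a, b):\n    return a - b\n",
]
PROBES = [(1, 2), (0, 0), (5, -3)]

text_choice = text_majority_vote(CANDIDATES)
semantic_choice = semantic_vote(CANDIDATES, "add", PROBES)
```

In this toy setup the text-based vote selects the buggy subtraction, because it is the only source string that appears twice, while the execution-based vote selects an addition implementation, because three candidates produce the same outputs despite differing text. The example shows only the mechanism and says nothing about the size of the gap reported in the study.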