AI · Neutral · arXiv – CS AI · 15h ago · 6/10
VISTA: An End-to-End Benchmark for Visual Spec-to-Web-App Coding Agents
VISTA is a new benchmark for evaluating how well AI agents generate functional web applications from visual specifications and text descriptions. It defines five testing conditions that vary in level of design detail and in technology stack constraints, and it uses manual annotations and multi-modal evaluation metrics to assess both visual fidelity and functional correctness.