AI · Neutral · arXiv – CS AI · 5h ago
SpotIt: Evaluating Text-to-SQL Evaluation with Formal Verification
Researchers introduce SpotIt, an evaluation method for Text-to-SQL systems that uses formal verification to search for database instances on which a generated query and its ground-truth query return different results. Testing on the BIRD dataset revealed that current test-based evaluation methods often miss such differences.
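To make the gap concrete, here is a minimal Python sketch of how a single test database can hide a difference that another database instance exposes. It is not SpotIt's implementation: the `employees` table, both queries, and the helpers `run_query` and `find_counterexample` are hypothetical, and a brute-force enumeration of tiny tables stands in for the formal verification the summary describes.

```python
import itertools
import sqlite3

# Hypothetical ground-truth and generated queries; they differ only in > versus >=.
GOLD_SQL = "SELECT name FROM employees WHERE salary > 50000"
PRED_SQL = "SELECT name FROM employees WHERE salary >= 50000"


def run_query(rows, sql):
    """Run sql on a fresh in-memory database holding rows; return sorted results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)
    result = sorted(conn.execute(sql).fetchall())
    conn.close()
    return result


def find_counterexample(gold_sql, pred_sql, names, salaries, max_rows=2):
    """Enumerate small tables and return the first one on which the queries disagree.

    Assumption: bounded brute-force search, used here only as a stand-in
    for a solver-based search over database instances.
    """
    candidate_rows = list(itertools.product(names, salaries))
    for size in range(max_rows + 1):
        for rows in itertools.combinations(candidate_rows, size):
            if run_query(rows, gold_sql) != run_query(rows, pred_sql):
                return list(rows)
    return None


# A fixed test database on which both queries happen to return the same rows.
test_db = [("alice", 60000), ("bob", 40000)]
agree_on_test_db = run_query(test_db, GOLD_SQL) == run_query(test_db, PRED_SQL)

# Searching over small instances finds a table that separates the two queries.
counterexample = find_counterexample(
    GOLD_SQL, PRED_SQL, names=["alice"], salaries=[40000, 50000, 60000]
)

print("agree on test database:", agree_on_test_db)
print("distinguishing instance:", counterexample)
```

In this sketch the test database contains no row at the boundary value, so a test-based check would accept the generated query, while the search returns a one-row table that separates the two.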