#sql-generation News & Analysis

5 articles tagged with #sql-generation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

BADGER: Bridging Agentic and Deterministic Evaluation for Generative Enterprise Reasoning

Merkle has developed BADGER, a unified evaluation framework that combines text-to-SQL assessment with agentic behavior evaluation for enterprise AI systems. The framework achieves substantial agreement with human expert judgment (Cohen's kappa=0.717) and outperforms six competing evaluation approaches, addressing a critical gap in production-grade AI system assessment.
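
For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Cohen's kappa is computed for two raters. The labels are hypothetical and are not BADGER's data; the function name `cohens_kappa` is an illustrative choice, not part of the framework. By a commonly cited convention, values between 0.61 and 0.80 are described as "substantial" agreement.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: fraction of items where both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always use the same single label
    return (observed - expected) / (1 - expected)


# Hypothetical verdicts from a human expert and an automated metric.
human = ["pass", "pass", "fail", "pass", "fail",
         "fail", "pass", "pass", "fail", "pass"]
metric = ["pass", "pass", "fail", "fail", "fail",
          "fail", "pass", "pass", "pass", "pass"]

kappa = cohens_kappa(human, metric)
print(f"kappa = {kappa:.3f}")
```

Here the raters agree on 8 of 10 items, but because chance agreement is already 0.52, kappa comes out well below the raw 80% agreement rate.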

AI · Bullish · arXiv – CS AI · May 11 · 6/10

Towards Autonomous Business Intelligence via Data-to-Insight Discovery Agent

Researchers introduce AIDA, an autonomous agent framework designed to transform complex enterprise data into actionable business insights by combining large language models with a domain-specific language and reinforcement learning. The system outperforms traditional workflow-based approaches in analyzing multi-dimensional retail data, demonstrating the potential for AI-driven autonomous intelligence in enterprise business intelligence systems.

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 4

From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database Agents

Researchers introduced EHR-ChatQA, a new benchmark for testing AI agents that interact with Electronic Health Record databases through natural language queries. The benchmark reveals significant reliability gaps in current state-of-the-art LLMs, with success rates dropping substantially when consistency across multiple trials is required.
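
To illustrate why requiring consistency across trials lowers success rates, here is a small sketch comparing two ways of scoring repeated runs. The summary does not say how the benchmark defines its metric, so the "any trial" versus "all trials" split, the task names, and the outcomes below are all assumptions made for illustration.

```python
# Hypothetical outcomes: three independent trials per task (True = success).
trials = {
    "task_1": [True, True, True],
    "task_2": [True, False, True],
    "task_3": [False, False, True],
    "task_4": [False, False, False],
}


def success_rates(trials):
    """Return (rate with at least one success, rate with every trial succeeding)."""
    n = len(trials)
    any_rate = sum(any(runs) for runs in trials.values()) / n
    all_rate = sum(all(runs) for runs in trials.values()) / n
    return any_rate, all_rate


any_rate, all_rate = success_rates(trials)
print(f"at least one success: {any_rate:.2f}")
print(f"all trials succeed:   {all_rate:.2f}")
```

In this toy data, three of four tasks succeed at least once, but only one succeeds every time, which is the kind of gap a consistency requirement exposes.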

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 7

Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL

Researchers propose Struct-SQL, a knowledge distillation framework that improves Small Language Models for Text-to-SQL tasks by using structured Chain-of-Thought reasoning instead of unstructured approaches. The method achieves an 8.1% improvement over baseline distillation, primarily by reducing syntactic errors through formal query execution plan blueprints.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

SpotIt: Evaluating Text-to-SQL Evaluation with Formal Verification

Researchers introduce SpotIt, a new evaluation method for Text-to-SQL systems that uses formal verification to find database instances where generated queries differ from ground-truth queries. Testing on the BIRD dataset revealed that current test-based evaluation methods often miss differences between generated and correct SQL queries.
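
The gap described above can be shown with a toy case: two queries that return identical results on one test database yet differ on another instance. This sketch uses a hand-picked counterexample and Python's built-in `sqlite3`; it does not implement SpotIt's formal-verification search, and the table, queries, and rows are hypothetical.

```python
import sqlite3

# Hypothetical ground-truth and generated queries that differ only at the boundary.
GOLD = "SELECT name FROM emp WHERE salary > 50000 ORDER BY name"
GENERATED = "SELECT name FROM emp WHERE salary >= 50000 ORDER BY name"


def run(query, rows):
    """Execute a query against a fresh in-memory database holding the given rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
        conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# A test database with no row exactly at the boundary value.
test_db = [("Ana", 40000), ("Ben", 60000)]
# A database instance that separates the two queries.
counterexample_db = [("Ana", 40000), ("Ben", 60000), ("Cy", 50000)]

same_on_test = run(GOLD, test_db) == run(GENERATED, test_db)
same_on_counterexample = (
    run(GOLD, counterexample_db) == run(GENERATED, counterexample_db)
)
print("match on test database:", same_on_test)
print("match on counterexample:", same_on_counterexample)
```

A test-based evaluator that only sees `test_db` would mark the generated query as correct; only an instance containing a salary of exactly 50000 reveals the difference.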