#text-to-sql News & Analysis

15 articles tagged with #text-to-sql. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 11 · 7/10

TAHOE: Text-to-SQL with Automated Hint Optimization from Experience

Researchers introduce Tahoe, a system that optimizes LLM-based Text-to-SQL conversion through dynamic prompt engineering rather than model retraining. By consolidating debugging traces into reusable hints and modeling conflicting user intents as strategies, Tahoe raises query pass rates from 62% to 79% on the Spider 2.0-Snow benchmark while remaining compatible with weaker model backbones.

🧠 GPT-5

AI · Neutral · arXiv – CS AI · Jun 9 · 7/10

UniQL: Towards Dialect-Universal Benchmarking for Text-to-SQL

UniQL introduces a new benchmark for evaluating text-to-SQL models across 16 different SQL dialects, addressing a critical gap where existing benchmarks focus primarily on SQLite. The study reveals that current large language models struggle with cross-dialect generalization, performing inconsistently across different database systems despite success on SQLite.
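
To make the dialect gap concrete, here is a minimal sketch of how one simple request ("first n rows, ordered by a column") is spelled differently across a few common systems. The function name `top_n_query` and the dialect labels are hypothetical and chosen for illustration; none of this is taken from UniQL itself.

```python
def top_n_query(dialect: str, table: str, order_col: str, n: int) -> str:
    """Render 'first n rows ordered by order_col' for a few SQL dialects.

    Illustrative only: identifiers are interpolated without quoting or escaping.
    """
    if dialect in {"sqlite", "postgresql", "mysql"}:
        # These systems accept a trailing LIMIT clause.
        return f"SELECT * FROM {table} ORDER BY {order_col} LIMIT {n}"
    if dialect == "sqlserver":
        # SQL Server places the row cap right after SELECT.
        return f"SELECT TOP {n} * FROM {table} ORDER BY {order_col}"
    if dialect == "oracle":
        # Oracle 12c and later support the standard row-limiting clause.
        return f"SELECT * FROM {table} ORDER BY {order_col} FETCH FIRST {n} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")


sqlite_sql = top_n_query("sqlite", "orders", "total", 3)
sqlserver_sql = top_n_query("sqlserver", "orders", "total", 3)
oracle_sql = top_n_query("oracle", "orders", "total", 3)
```

A model that has mostly seen SQLite-style SQL would tend to emit the `LIMIT` form everywhere, which is one plausible source of the cross-dialect inconsistency the summary describes.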

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

APEX-SQL: Talking to the data via Agentic Exploration for Text-to-SQL

Researchers introduce APEX-SQL, an agentic framework that improves Text-to-SQL systems by using hypothesis-verification loops and real data exploration instead of static schema representations. The system achieves 70.65% execution accuracy on BIRD and 51.01% on Spider 2.0-Snow benchmarks, demonstrating significant performance gains for enterprise database query generation.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

SQLConductor: Search-to-Policy Learning for Step-wise Text-to-SQL Orchestration

SQLConductor is a new AI framework that improves Text-to-SQL systems (tools that convert natural language queries into database commands) by using adaptive, step-wise orchestration rather than fixed pipelines. The system achieves 73.2% execution accuracy on complex database queries while using smaller, frozen models, suggesting significant efficiency gains for database accessibility applications.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

Progress-SQL: Improving Reinforcement Learning for Text-to-SQL via Progressive Rewards

Researchers introduce Progress-SQL, a reinforcement learning framework that improves large language models' ability to convert natural language queries into SQL code through multi-turn refinement with progressive reward signals. The method uses an Oracle-guided Diagnostic Tree to provide clause-level feedback and demonstrates consistent performance improvements across multiple benchmark datasets.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

SIRIUS-SQL: Anchoring Multi-Candidate Text-to-SQL in Execution Feedback

SIRIUS-SQL introduces a multi-candidate approach to Text-to-SQL generation that addresses redundancy, execution error classification, and selector limitations through difficulty-smoothing reinforcement learning, targeted repair mechanisms, and hybrid confidence-gated selection. The system achieves 75.88% accuracy on BIRD dev and 91.20% on SPIDER test, surpassing previous state-of-the-art multi-candidate systems.
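
For readers unfamiliar with multi-candidate selection, the sketch below shows the simplest execution-based baseline: run every candidate, discard those that fail, and keep a query whose result set is most common. This is plain majority voting over execution results, not SIRIUS-SQL's hybrid confidence-gated selector; the helper `pick_by_execution`, the `emp` table, and the candidate queries are all hypothetical.

```python
import sqlite3
from collections import Counter


def pick_by_execution(conn, candidates):
    """Return the first candidate whose result set is the most common one.

    Candidates that raise a database error are discarded. Returns None if
    no candidate executes successfully.
    """
    results = {}
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # execution feedback: drop queries that fail to run
        # Sort so that row order does not affect the comparison.
        results[sql] = tuple(sorted(rows, key=repr))
    if not results:
        return None
    winner, _ = Counter(results.values()).most_common(1)[0]
    return next(sql for sql in candidates if results.get(sql) == winner)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp(name TEXT, dept TEXT, salary INTEGER);
INSERT INTO emp VALUES ('Ann', 'eng', 120), ('Bob', 'eng', 100), ('Cy', 'ops', 90);
""")

candidates = [
    "SELECT name FROM emp WHERE salary > 95",    # returns Ann, Bob
    "SELECT name FROM emp WHERE dept = 'eng'",   # returns Ann, Bob
    "SELECT name FROM emp WHERE salary > 110",   # returns Ann only
    "SELECT nme FROM emp",                       # fails: no such column
]
best = pick_by_execution(conn, candidates)
```

Majority voting cannot tell a popular wrong answer from a correct one, which is the kind of selector limitation that motivates more elaborate schemes.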

AI · Neutral · arXiv – CS AI · May 29 · 6/10

EviLink: Multi-Path Schema Linking with Uncertainty-Guided Evidence Acquisition for Large-Scale Text-to-SQL

EviLink is a new AI framework that improves Text-to-SQL systems by treating schema linking as an uncertainty-aware process across multiple SQL paths rather than a single deterministic selection. The approach balances schema completeness, relevance, and computational cost, achieving 90.15% field-level recall on Spider2-Snow while using fewer tokens than existing methods.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

CORE-T: COherent REtrieval of Tables for Text-to-SQL

CORE-T introduces a training-free framework for improving table retrieval in text-to-SQL systems by combining dense retrieval with LLM-generated metadata and compatibility caching. The approach achieves gains of up to 22.7 points in table-selection F1 and 24.4 points in multi-table execution accuracy while reducing inference tokens by 64-76% compared to LLM-intensive alternatives.

AI · Bullish · arXiv – CS AI · May 11 · 6/10

CA-SQL: Complexity-Aware Inference Time Reasoning for Text-to-SQL via Exploration and Compute Budget Allocation

Researchers introduce CA-SQL, an advanced Text-to-SQL pipeline that dynamically allocates computational resources based on task complexity to improve LLM reasoning. The method achieves state-of-the-art performance on the BIRD benchmark's challenging tier using only GPT-4o-mini, outperforming larger models and demonstrating the efficiency gains possible through intelligent inference-time optimization.

🧠 GPT-4

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

LatentRefusal: Latent-Signal Refusal for Unanswerable Text-to-SQL Queries

Researchers propose LatentRefusal, a safety mechanism for LLM-based text-to-SQL systems that detects unanswerable queries by analyzing intermediate hidden activations rather than relying on output-level instruction following. The approach achieves 88.5% F1 score across four benchmarks while adding minimal computational overhead, addressing a critical deployment challenge in AI systems that generate executable code.

AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

CBR-to-SQL: Rethinking Retrieval-based Text-to-SQL using Case-based Reasoning in the Healthcare Domain

Researchers introduce CBR-to-SQL, a new framework using Case-Based Reasoning to improve natural language-to-SQL translation for healthcare databases. The system addresses limitations of standard RAG approaches by using two-stage retrieval and abstract case templates, achieving state-of-the-art results on medical datasets.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10

Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL

Researchers propose Struct-SQL, a knowledge distillation framework that improves Small Language Models for Text-to-SQL tasks by using structured Chain-of-Thought reasoning instead of unstructured approaches. The method achieves an 8.1% improvement over baseline distillation, primarily by reducing syntactic errors through formal query execution plan blueprints.

AI · Neutral · arXiv – CS AI · Mar 12 · 4/10

EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution

Researchers introduce EvoSchema, a comprehensive benchmark that tests how well text-to-SQL models handle database schema changes over time. The study finds that table-level changes affect model performance significantly more than column-level modifications, and it proposes training methods to improve model robustness in dynamic database environments.
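
As a small illustration of why schema evolution matters, the sketch below shows a query that is valid before a table-level change (a rename) and fails afterwards. The `customers` and `clients` tables are hypothetical and this is not an EvoSchema test case, only a minimal instance of the kind of change the benchmark studies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers(id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
""")

# A query written (or generated) against the original schema.
query = "SELECT name FROM customers ORDER BY id"
before = conn.execute(query).fetchall()

# Table-level schema change: the table is renamed.
conn.execute("ALTER TABLE customers RENAME TO clients")

try:
    conn.execute(query)
    broke = False
except sqlite3.OperationalError:
    # The old query now references a table that no longer exists.
    broke = True

# The same intent, rewritten for the evolved schema, still works.
after = conn.execute("SELECT name FROM clients ORDER BY id").fetchall()
```

A model that has memorized the old table name would keep producing the stale query, so robustness here means re-reading the current schema rather than relying on what was seen in training.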

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

SpotIt+: Verification-based Text-to-SQL Evaluation with Database Constraints

SpotIt+ is a new open-source tool that evaluates Text-to-SQL systems through verification-based testing, actively searching for database instances that reveal differences between generated and ground-truth SQL queries. The tool incorporates constraint mining that combines rule-based specification mining with LLM validation to generate more realistic test scenarios.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

SpotIt: Evaluating Text-to-SQL Evaluation with Formal Verification

Researchers introduce SpotIt, a new evaluation method for Text-to-SQL systems that uses formal verification to find database instances where generated queries differ from ground-truth queries. Testing on the BIRD dataset revealed that current test-based evaluation methods often miss differences between generated and correct SQL queries.
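
The gap between test-based evaluation and verification can be seen with a hand-built example: two queries that return identical results on one database instance but diverge on another. The sketch below picks the distinguishing instance by hand, whereas a verification-based approach searches for one automatically; the `orders` table, the two queries, and the helper `run` are hypothetical.

```python
import sqlite3


def run(rows, sql):
    """Execute sql against a fresh in-memory orders table holding rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders(id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    result = sorted(conn.execute(sql).fetchall())
    conn.close()
    return result


gold = "SELECT id FROM orders WHERE amount >= 100"  # ground-truth query
pred = "SELECT id FROM orders WHERE amount > 100"   # generated query, off by one

# The test database has no row at the boundary value, so both queries agree.
test_db = [(1, 50), (2, 150)]
# One row with amount exactly 100 is enough to separate them.
counterexample_db = [(1, 50), (2, 150), (3, 100)]

same_on_test = run(test_db, gold) == run(test_db, pred)
same_on_counterexample = run(counterexample_db, gold) == run(counterexample_db, pred)
```

On the test database the generated query would be scored as correct, even though it is not equivalent to the ground-truth query.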