
From Graph Retrieval to Schema Realization: Counterfactual Validation for Text-to-SPARQL over Heterogeneous Knowledge Graphs

arXiv – CS AI | Yang Zhao, Chengxiao Dai, Yue Xiu, Dusit Niyato | June 2, 2026 at 04:00 AM

🤖AI Summary

SchemaForge, a new AI framework, improves text-to-SPARQL query generation over heterogeneous knowledge graphs by using schema-grounded validation. The system achieves 11.5 percentage points higher accuracy than existing baselines across four benchmarks, demonstrating practical advances in natural language to database query translation.

Analysis

SchemaForge addresses a critical challenge in knowledge graph question answering (KGQA): translating natural-language questions into executable SPARQL queries when the data is spread across multiple, differently structured knowledge graphs. Traditional text-to-SPARQL systems assume a fixed, homogeneous graph schema, but real-world deployments involve fragmented data with varying predicates, entity types, and metadata availability. This gap between benchmark assumptions and practical deployment has limited adoption of KGQA systems in production environments.

The framework's innovation lies in its two-stage approach: weak graph evidence first narrows the candidate graphs, then schema-specific validation determines whether a selected graph slice can actually support the required query structure. This counterfactual validation mechanism, which checks whether answer sets logically satisfy the question, guards against queries that are syntactically correct but semantically invalid. On Spider4SPARQL, SchemaForge improved execution accuracy from 54.86% to 64.18%, a gain of 9.32 percentage points (about 17% in relative terms).
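The two-stage idea described above can be illustrated with a small Python sketch. This is not SchemaForge's implementation: the `GraphSchema` class, the lexical-overlap scoring, the subset-based schema check, and the toy `movies` and `awards` graphs are all hypothetical stand-ins chosen for illustration. It only shows how a cheap, weak signal can shortlist graphs while a stricter schema check decides which one can actually support the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphSchema:
    """Hypothetical summary of one knowledge graph's vocabulary."""
    name: str
    predicates: frozenset
    classes: frozenset


def weak_evidence_score(question_terms, schema):
    # Stage 1 (assumed): count question terms that appear in the schema vocabulary.
    vocabulary = schema.predicates | schema.classes
    return len(set(question_terms) & vocabulary)


def shortlist(question_terms, schemas, k=3):
    # Keep the k graphs with the highest weak-evidence score.
    ranked = sorted(
        schemas,
        key=lambda s: weak_evidence_score(question_terms, s),
        reverse=True,
    )
    return ranked[:k]


def schema_supports(schema, required_predicates, required_classes):
    # Stage 2 (assumed): the graph must expose every predicate and class
    # that the intended query structure needs.
    return (set(required_predicates) <= schema.predicates
            and set(required_classes) <= schema.classes)


def select_graph(question_terms, required_predicates, required_classes, schemas, k=3):
    # Return the first shortlisted graph that passes the schema check, else None.
    for schema in shortlist(question_terms, schemas, k):
        if schema_supports(schema, required_predicates, required_classes):
            return schema
    return None


# Toy data, invented for this example.
movies = GraphSchema(
    "movies",
    frozenset({"director", "actor", "releaseYear"}),
    frozenset({"film", "person"}),
)
awards = GraphSchema(
    "awards",
    frozenset({"director", "awardReceived"}),
    frozenset({"film", "award"}),
)

terms = {"film", "director", "actor"}
chosen = select_graph(terms, {"director", "awardReceived"}, {"film"}, [movies, awards])
print(chosen.name)
```

In this toy case the weak evidence ranks `movies` first, but only `awards` exposes the `awardReceived` predicate the query needs, so the schema check overrides the initial ranking.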

For the AI and database communities, this work has implications for enterprise data integration. As organizations maintain multiple siloed databases and knowledge graphs, automated query translation across heterogeneous schemas becomes increasingly valuable. The ability to correctly allocate questions to appropriate graph sources (97% Top-3 accuracy) and generate valid queries (73% Top-1 accuracy) demonstrates practical utility. Schema-grounded query generation reduces hallucination and improves reliability—critical factors for production deployment where incorrect queries can return misleading results or fail entirely.
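For readers unfamiliar with the Top-1 and Top-3 figures quoted above, the following Python sketch shows how Top-k accuracy is conventionally computed for a ranking task. The function name, the graph identifiers, and the four-question toy dataset are assumptions made for this example and are unrelated to the paper's data.

```python
def top_k_accuracy(ranked_predictions, gold, k):
    """Fraction of questions whose gold label appears among the first k predictions."""
    if len(ranked_predictions) != len(gold):
        raise ValueError("ranked_predictions and gold must have the same length")
    hits = sum(
        1 for ranked, answer in zip(ranked_predictions, gold)
        if answer in ranked[:k]
    )
    return hits / len(gold)


# Toy rankings of candidate graphs for four questions (invented data).
ranked = [
    ["g1", "g2", "g3"],
    ["g2", "g1", "g3"],
    ["g3", "g1", "g2"],
    ["g1", "g3", "g2", "g4"],
]
gold = ["g1", "g1", "g2", "g4"]

top1 = top_k_accuracy(ranked, gold, 1)
top3 = top_k_accuracy(ranked, gold, 3)
print(top1, top3)
```

Top-3 is always at least as high as Top-1, since a correct first-ranked prediction also counts as a Top-3 hit.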

Future development should focus on scaling this approach to larger graph collections and exploring its applicability beyond Wikidata-style RDF systems to other structured data formats used in enterprise environments.

Key Takeaways

→ SchemaForge improves text-to-SPARQL accuracy by 11.5 percentage points through schema-grounded, counterfactual validation across heterogeneous knowledge graphs
→ The framework allocates questions to the correct graphs with 97% Top-3 accuracy, addressing a key practical limitation of existing KGQA systems
→ The schema-slice alignment mechanism moves beyond syntax-only query generation to ensure semantic correctness and executability
→ Performance gains suggest that weak graph evidence combined with strong schema-specific commitments reduces query hallucination in multi-graph environments
→ Results hold across four public benchmarks, suggesting the approach is robust to diverse knowledge graph architectures

#knowledge-graphs #natural-language-processing #sparql-query-generation #semantic-web #kgqa #schema-alignment #ai-systems
