🧠 AI · Neutral · Importance 5/10

Characterising LLM-Generated Competency Questions: a Cross-Domain Empirical Study using Open and Closed Models

arXiv – CS AI | Reham Alharbi, Valentina Tamma, Terry R. Payne, Jacopo de Berardinis
🤖 AI Summary

Researchers conducted a systematic cross-domain study evaluating how large language models generate Competency Questions (CQs)—natural language requirements for ontology engineering. Using both open-source models (Llama, KimiK2) and proprietary systems (GPT-4, Gemini 2.5), they identified measurable differences in readability, relevance, and structural complexity, revealing that LLM performance varies significantly by use case.
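
To make the setup concrete, the sketch below shows one plausible way to prompt a chat model for competency questions about a domain. The prompt wording, model name, and output parsing are illustrative assumptions, not the prompting protocol used in the paper.

```python
# Minimal sketch: asking a chat model to draft Competency Questions (CQs)
# for an ontology domain. Prompt text, model choice, and parsing are
# illustrative assumptions, not the study's actual pipeline.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def generate_cqs(domain: str, n: int = 5, model: str = "gpt-4") -> list[str]:
    """Ask the model for n candidate competency questions about `domain`."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You are an ontology engineer eliciting requirements."},
            {"role": "user",
             "content": f"Write {n} competency questions, one per line, "
                        f"that an ontology about {domain} should answer."},
        ],
        temperature=0.7,
    )
    text = response.choices[0].message.content or ""
    # Keep non-empty lines; strip any leading numbering such as "1. "
    return [line.lstrip("0123456789. -").strip()
            for line in text.splitlines() if line.strip()]

if __name__ == "__main__":
    for cq in generate_cqs("wine production"):
        print(cq)
```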

Analysis

This research addresses a practical gap in understanding how generative AI can scale ontology engineering, a traditionally manual process requiring domain experts. Competency Questions form the foundation of requirement elicitation in knowledge representation systems, and automating their generation could democratize access to sophisticated knowledge modeling. The study's empirical approach—establishing quantitative measures for cross-model comparison—provides the rigor needed to guide practitioners in model selection.
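
The summary does not spell out the paper's exact metrics, but the sketch below shows one self-contained way to score a generated CQ on the three dimensions named above, using simple proxies: Flesch reading ease for readability, domain-term overlap for relevance, and token plus clause-marker counts for structural complexity. These formulas and word lists are assumptions for illustration.

```python
# Hypothetical scoring of a generated CQ on readability, relevance, and
# structural complexity. Proxy metrics only; not the measures reported
# in the paper.
import re

def _syllables(word: str) -> int:
    # Crude vowel-group heuristic; adequate for a rough readability proxy.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(cq: str) -> float:
    """Flesch reading ease: higher scores mean easier-to-read questions."""
    words = re.findall(r"[A-Za-z']+", cq)
    sentences = len(re.findall(r"[.!?]+", cq)) or 1
    syllables = sum(_syllables(w) for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

def relevance(cq: str, domain_terms: set[str]) -> float:
    """Fraction of the supplied domain terms that the question mentions."""
    tokens = {w.lower() for w in re.findall(r"[A-Za-z']+", cq)}
    return len(tokens & domain_terms) / max(1, len(domain_terms))

def complexity(cq: str) -> int:
    """Rough structural complexity: token count plus nested-clause markers."""
    tokens = re.findall(r"[A-Za-z']+", cq)
    clause_markers = {"which", "that", "whose", "where", "when", "and", "or"}
    return len(tokens) + sum(1 for t in tokens if t.lower() in clause_markers)

cq = "Which grape varieties are used in wines produced in a given region?"
print(readability(cq), relevance(cq, {"grape", "wine", "region"}), complexity(cq))
```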

Ontology engineering has historically been labor-intensive and expertise-dependent, limiting adoption across organizations. The emergence of capable LLMs presents an opportunity to reduce this burden, though the heterogeneous landscape of available models makes informed selection critical. This research contextualizes that selection challenge by benchmarking both commercial and open-source models across multiple dimensions, revealing that no single model universally excels.

For developers and knowledge engineers, these findings suggest that model choice should be use-case dependent rather than based on generic capability claims. The inclusion of open models (Llama variants, KimiK2) demonstrates that smaller, accessible models can produce competitive results in specific domains, potentially reducing infrastructure costs and vendor lock-in concerns. The variation in performance profiles implies that hybrid approaches—combining models optimized for different CQ properties—might yield superior outcomes.
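
One way such a hybrid setup could look is a per-property routing table that maps the CQ quality a practitioner prioritises to the model benchmarked best for it. The assignments below are placeholders, not recommendations from the paper; a real profile would come from evaluation on the target domain.

```python
# Hypothetical per-property model routing. Assignments are placeholders,
# not the paper's findings.
PROFILE = {
    "readability": "gpt-4",        # placeholder assignment
    "relevance": "llama-3-70b",    # placeholder assignment
    "complexity": "kimi-k2",       # placeholder assignment
}

def pick_model(priority: str) -> str:
    """Return the model configured for the CQ property we care most about."""
    return PROFILE.get(priority, "gpt-4")

print(pick_model("relevance"))
```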

Future work likely involves expanding domain coverage, incorporating user evaluation to validate computational measures, and developing frameworks that match model selection to specific ontology engineering requirements. As enterprises increasingly adopt knowledge graphs and semantic systems, systematic approaches to automating their foundational requirements become strategically valuable.

Key Takeaways
  • Competency Question generation via LLMs is viable but performance varies significantly across use cases and model architectures.
  • Open-source models like Llama 3 can match or approach closed-model performance in ontology requirement elicitation tasks.
  • Quantitative measures for readability, relevance, and structural complexity enable systematic model comparison beyond benchmark scores.
  • Use-case-specific generation profiles suggest practitioners should conduct tailored model evaluation rather than relying on generic capability rankings.
  • Automating CQ generation could democratize ontology engineering by reducing manual effort and broadening stakeholder participation.
Models Mentioned
  • GPT-4 (OpenAI)
  • Gemini (Google)
Read Original → via arXiv – CS AI