🧠 AI · Neutral · Importance 5/10

Characterising LLM-Generated Competency Questions: a Cross-Domain Empirical Study using Open and Closed Models

arXiv – CS AI | Reham Alharbi, Valentina Tamma, Terry R. Payne, Jacopo de Berardinis
🤖 AI Summary

Researchers conducted a systematic cross-domain study evaluating how large language models generate Competency Questions (CQs)—natural language requirements for ontology engineering. Using both open-source models (Llama, KimiK2) and proprietary systems (GPT-4, Gemini 2.5), they identified measurable differences in readability, relevance, and structural complexity, revealing that LLM performance varies significantly by use case.
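
To make the setup concrete, the sketch below shows one plausible way to prompt a chat model for competency questions about a domain. The prompt wording, model name, and output parsing are illustrative assumptions, not the prompting protocol used in the paper.

```python
# Minimal sketch: asking a chat model to draft Competency Questions (CQs)
# for an ontology domain. Prompt text, model choice, and parsing are
# illustrative assumptions, not the study's actual pipeline.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def generate_cqs(domain: str, n: int = 5, model: str = "gpt-4") -> list[str]:
    """Ask the model for n candidate competency questions about `domain`."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You are an ontology engineer eliciting requirements."},
            {"role": "user",
             "content": f"Write {n} competency questions, one per line, "
                        f"that an ontology about {domain} should answer."},
        ],
        temperature=0.7,
    )
    text = response.choices[0].message.content or ""
    # Keep non-empty lines; strip any leading numbering such as "1. "
    return [line.lstrip("0123456789. -").strip()
            for line in text.splitlines() if line.strip()]

if __name__ == "__main__":
    for cq in generate_cqs("wine production"):
        print(cq)
```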

Analysis

This research addresses a practical gap in understanding how generative AI can scale ontology engineering, a traditionally manual process requiring domain experts. Competency Questions form the foundation of requirement elicitation in knowledge representation systems, and automating their generation could democratize access to sophisticated knowledge modeling. The study's empirical approach—establishing quantitative measures for cross-model comparison—provides the rigor needed to guide practitioners in model selection.
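
The summary does not spell out the paper's exact metrics, but the sketch below shows one self-contained way to score a generated CQ on the three dimensions named above, using simple proxies: Flesch reading ease for readability, domain-term overlap for relevance, and token plus clause-marker counts for structural complexity. These formulas and word lists are assumptions for illustration.

```python
# Hypothetical scoring of a generated CQ on readability, relevance, and
# structural complexity. Proxy metrics only; not the measures reported
# in the paper.
import re

def _syllables(word: str) -> int:
    # Crude vowel-group heuristic; adequate for a rough readability proxy.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(cq: str) -> float:
    """Flesch reading ease: higher scores mean easier-to-read questions."""
    words = re.findall(r"[A-Za-z']+", cq)
    sentences = len(re.findall(r"[.!?]+", cq)) or 1
    syllables = sum(_syllables(w) for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

def relevance(cq: str, domain_terms: set[str]) -> float:
    """Fraction of the supplied domain terms that the question mentions."""
    tokens = {w.lower() for w in re.findall(r"[A-Za-z']+", cq)}
    return len(tokens & domain_terms) / max(1, len(domain_terms))

def complexity(cq: str) -> int:
    """Rough structural complexity: token count plus nested-clause markers."""
    tokens = re.findall(r"[A-Za-z']+", cq)
    clause_markers = {"which", "that", "whose", "where", "when", "and", "or"}
    return len(tokens) + sum(1 for t in tokens if t.lower() in clause_markers)

cq = "Which grape varieties are used in wines produced in a given region?"
print(readability(cq), relevance(cq, {"grape", "wine", "region"}), complexity(cq))
```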

Ontology engineering has historically been labor-intensive and expertise-dependent, limiting adoption across organizations. The emergence of capable LLMs presents an opportunity to reduce this burden, though the heterogeneous landscape of available models makes informed selection critical. This research contextualizes that selection challenge by benchmarking both commercial and open-source models across multiple dimensions, revealing that no single model universally excels.

For developers and knowledge engineers, these findings suggest that model choice should be use-case dependent rather than based on generic capability claims. The inclusion of open models (Llama variants, KimiK2) demonstrates that smaller, accessible models can produce competitive results in specific domains, potentially reducing infrastructure costs and vendor lock-in concerns. The variation in performance profiles implies that hybrid approaches—combining models optimized for different CQ properties—might yield superior outcomes.
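
One way such a hybrid setup could look is a per-property routing table that maps the CQ quality a practitioner prioritises to the model benchmarked best for it. The assignments below are placeholders, not recommendations from the paper; a real profile would come from evaluation on the target domain.

```python
# Hypothetical per-property model routing. Assignments are placeholders,
# not the paper's findings.
PROFILE = {
    "readability": "gpt-4",        # placeholder assignment
    "relevance": "llama-3-70b",    # placeholder assignment
    "complexity": "kimi-k2",       # placeholder assignment
}

def pick_model(priority: str) -> str:
    """Return the model configured for the CQ property we care most about."""
    return PROFILE.get(priority, "gpt-4")

print(pick_model("relevance"))
```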

Future work likely involves expanding domain coverage, incorporating user evaluation to validate computational measures, and developing frameworks that match model selection to specific ontology engineering requirements. As enterprises increasingly adopt knowledge graphs and semantic systems, systematic approaches to automating their foundational requirements become strategically valuable.

Key Takeaways
  • Competency Question generation via LLMs is viable but performance varies significantly across use cases and model architectures.
  • Open-source models like Llama 3 can match or approach closed-model performance in ontology requirement elicitation tasks.
  • Quantitative measures for readability, relevance, and structural complexity enable systematic model comparison beyond benchmark scores.
  • Use-case-specific generation profiles suggest practitioners should conduct tailored model evaluation rather than relying on generic capability rankings.
  • Automating CQ generation could democratize ontology engineering by reducing manual effort and broadening stakeholder participation.
Models Mentioned
  • GPT-4 (OpenAI)
  • Gemini (Google)
Read Original → via arXiv – CS AI