An NLP-Driven Framework for Curriculum-Labor Market Alignment: Schema-Constrained LLM Extraction, ESCO-Anchored Semantic Matching, and Multi-Dimensional Gap Quantification
Researchers present an NLP framework that uses large language models and semantic matching to extract competencies from educational curricula and align them with labor-market demands. Applied to a UAE university's computer science program, the system identified significant gaps in general skills and algorithms while finding near-zero gaps in AI/data science, demonstrating a scalable approach to curriculum-labor market alignment.
This research addresses a critical operational challenge in higher education: ensuring that academic programs produce graduates with skills employers actually seek. The framework combines multiple NLP techniques—schema-constrained prompting, semantic embedding alignment via SBERT, and inter-model adjudication—to systematize what has traditionally been manual, subjective curriculum review. The use of the ESCO taxonomy as a grounding mechanism is particularly significant, as it provides a shared vocabulary across educational and labor-market domains, enabling reproducible comparisons rather than ad-hoc assessments.
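The ESCO grounding step can be pictured as a nearest-neighbor search in embedding space. The following is a minimal sketch, not the authors' implementation. It assumes the embeddings have already been computed (in the framework, by an SBERT model) and uses tiny hand-made vectors as stand-ins. The function name `match_to_esco`, the 0.6 threshold, and the example labels are all hypothetical and are not verified ESCO entries.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def match_to_esco(competency_vec, esco_vectors, threshold=0.6):
    """Return (label, score) for the closest concept, or (None, score) if below threshold.

    The threshold value is an assumption for illustration only.
    """
    best_label, best_score = None, -1.0
    for label, vec in esco_vectors.items():
        score = cosine(competency_vec, vec)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score

# Toy 3-dimensional stand-ins for SBERT embeddings (hypothetical values).
esco_vectors = {
    "machine learning": [0.9, 0.1, 0.0],
    "design algorithms": [0.1, 0.9, 0.1],
    "work in teams": [0.0, 0.1, 0.9],
}
extracted = [0.8, 0.3, 0.0]  # stand-in for an extracted competency statement
label, score = match_to_esco(extracted, esco_vectors)
```

Rejecting matches below a threshold keeps weakly related competencies from being forced onto a vocabulary entry, which is one way a controlled vocabulary can support reproducible comparisons.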
The application to UAE University's computer science program reveals actionable intelligence: the program achieves strong coverage in AI and data science (only a 1.8% gap against 38.6% supply) but underdelivers in general (transversal) skills (25% gap) and foundational algorithms (13.8% gap). This mismatch shows how institutions can use such frameworks to identify where curriculum redesign is needed. The methodology's reliability metrics (Cohen's kappa of 0.79, perfect schema conformance, and 100% document completeness) suggest the approach is reliable enough for practical use.
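For readers unfamiliar with the agreement statistic, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance. The sketch below computes it for two label sequences. The category names and the two toy label lists are hypothetical and do not come from the study.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

# Toy labels from two hypothetical extraction models.
model_a = ["ai", "ai", "algo", "general", "algo", "ai", "general", "algo"]
model_b = ["ai", "ai", "algo", "general", "ai", "ai", "general", "algo"]
kappa = cohens_kappa(model_a, model_b)
```

In this toy case the raters agree on 7 of 8 items (0.875 raw agreement), while kappa is about 0.81, because kappa discounts agreement that would occur by chance. A kappa value is therefore not the same thing as a percentage of matching labels.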
For educational institutions globally, this framework offers a data-driven alternative to periodic accreditation reviews. For employers, it provides transparency into what graduates have been trained on before hiring. For policymakers, it enables evidence-based workforce planning. The five-scope analysis ranging from core courses to probability-weighted student trajectories demonstrates flexibility across curriculum structures, making the approach generalizable across programs and institutions.
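One plausible reading of a probability-weighted trajectory scope is that each course contributes its competencies in proportion to the chance a student takes it. The sketch below works under that assumption only. The data layout, the function name `weighted_supply_shares`, and all numbers are hypothetical, and the paper's actual weighting may differ.

```python
from collections import defaultdict

def weighted_supply_shares(courses):
    """Return each category's share of supply, weighting courses by take probability.

    courses: list of (take_probability, {category: competency_count}).
    """
    totals = defaultdict(float)
    for prob, counts in courses:
        if not 0.0 <= prob <= 1.0:
            raise ValueError("take probability must lie in [0, 1]")
        for category, count in counts.items():
            totals[category] += prob * count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cat: value / grand_total for cat, value in totals.items()}

# Hypothetical curriculum: one core course and two electives.
courses = [
    (1.0, {"algorithms": 4, "general": 1}),  # core course, always taken
    (0.5, {"ai_data": 6}),                   # elective taken by half of students
    (0.25, {"general": 4}),                  # elective taken by a quarter
]
shares = weighted_supply_shares(courses)
```

Passing only the courses with probability 1.0 reproduces a core-only scope, so the same function can serve more than one scope in this simplified picture.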
- →Multi-stage LLM framework with ESCO semantic grounding extracts 400 competencies from an 85-course curriculum with an inter-model Cohen's kappa of 0.79 and perfect schema conformance.
- →Supply-demand analysis reveals 25% gap in general skills but only 1.8% gap in AI/data science, indicating curriculum misalignment with employer priorities in non-technical competencies.
- →SBERT semantic matching against controlled vocabulary enables reproducible, scalable curriculum-labor market alignment without manual review.
- →Framework achieves 100% document-level completeness and combines three validation mechanisms to ensure extraction reliability and coverage.
- →Methodology is generalizable across educational institutions and labor markets, enabling evidence-based curriculum redesign informed by real job postings.
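
The supply-demand comparison in the takeaways above can be sketched as a per-category difference between demand share and supply share. The summary does not state the exact gap formula, so this block assumes the gap is the unmet portion of demand, floored at zero. The function name `category_gaps` and the toy shares are hypothetical and are not the study's figures.

```python
def category_gaps(demand_shares, supply_shares):
    """Gap per category: demand share minus supply share, never below zero (assumed definition)."""
    gaps = {}
    for category, demand in demand_shares.items():
        supply = supply_shares.get(category, 0.0)
        gaps[category] = max(0.0, demand - supply)
    return gaps

# Toy shares of job-posting demand and curriculum supply (hypothetical values).
demand = {"general": 0.40, "algorithms": 0.25, "ai_data": 0.35}
supply = {"general": 0.10, "algorithms": 0.20, "ai_data": 0.70}

gaps = category_gaps(demand, supply)
# Rank categories from largest to smallest gap to prioritize redesign effort.
ranked = sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)
```

Flooring at zero means oversupplied categories report no gap instead of a negative one, which keeps the ranking focused on unmet demand.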