PrimeKG-CL: A Continual Graph Learning Benchmark on Evolving Biomedical Knowledge Graphs
Researchers introduced PrimeKG-CL, a benchmark dataset for continual graph learning built from nine biomedical databases with 129K+ nodes and 8.1M+ edges across two temporal snapshots (2021-2023). The work evaluates how different machine learning strategies handle evolving biomedical knowledge graphs, revealing that decoder choice and learning strategy interact significantly and that standard metrics fail to distinguish between retaining valid facts and forgetting outdated ones.
PrimeKG-CL addresses a critical gap in machine learning research by providing the first realistic benchmark for continual graph learning on biomedical knowledge graphs. Unlike existing benchmarks built from synthetic random splits of static data, this dataset captures genuine temporal evolution with 5.83M edges added and 889K removed between June 2021 and July 2023, reflecting how real biomedical ontologies evolve asynchronously across independent update cycles. This realistic scenario is essential for training systems that support drug repurposing and clinical decision support, where outdated information poses genuine risks.
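The added and removed edge counts above come from comparing two snapshots of the same graph. A minimal sketch of that comparison, assuming each edge is stored as a (head, relation, tail) triple, might look like the following. The function name `diff_snapshots` and the toy triples are hypothetical illustrations, not part of the PrimeKG-CL release.

```python
from typing import Dict, Set, Tuple

# Assumption: an edge is a (head, relation, tail) triple of string identifiers.
Triple = Tuple[str, str, str]


def diff_snapshots(old: Set[Triple], new: Set[Triple]) -> Dict[str, Set[Triple]]:
    """Split edges into those added, removed, and kept between two snapshots."""
    return {
        "added": new - old,        # present only in the newer snapshot
        "removed": old - new,      # present only in the older snapshot
        "persistent": old & new,   # present in both snapshots
    }


# Hypothetical toy snapshots standing in for the 2021 and 2023 graphs.
snapshot_2021 = {
    ("drugA", "treats", "diseaseX"),
    ("drugA", "treats", "diseaseY"),
    ("geneG", "associated_with", "diseaseY"),
}
snapshot_2023 = {
    ("drugA", "treats", "diseaseX"),
    ("geneG", "associated_with", "diseaseY"),
    ("drugB", "treats", "diseaseY"),
}

delta = diff_snapshots(snapshot_2021, snapshot_2023)
for name in ("added", "removed", "persistent"):
    print(name, len(delta[name]))
```

Keeping the removed set as its own category, rather than discarding it, is what makes it possible to later ask whether a model has stopped asserting those edges.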
The study finds that no single continual learning strategy performs best across all knowledge graph embedding decoders, which challenges the assumption that best practices generalize universally. It also shows that standard evaluation metrics conflate two distinct capabilities: retaining facts that remain valid and forgetting deprecated information. This distinction appears clearly only with certain decoders such as DistMult, suggesting that current metrics may mask fundamental failures in knowledge forgetting.
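To illustrate why a single score over old facts can hide a forgetting failure, here is a small sketch that scores triples with the standard DistMult function (the sum over dimensions of head × relation × tail) and reports retained and deprecated facts separately. Everything beyond the DistMult formula is an assumption made for illustration: the toy embeddings, the `acceptance_rate` helper, and the fixed threshold are hypothetical, and the benchmark itself may use ranking metrics rather than a threshold.

```python
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]


def distmult_score(h: Sequence[float], r: Sequence[float], t: Sequence[float]) -> float:
    """DistMult score: sum over dimensions of h_i * r_i * t_i."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def acceptance_rate(
    triples: List[Triple],
    entity_emb: Dict[str, Sequence[float]],
    relation_emb: Dict[str, Sequence[float]],
    threshold: float = 0.0,  # hypothetical cut-off, chosen only for this sketch
) -> float:
    """Fraction of triples whose score exceeds the threshold."""
    if not triples:
        return 0.0
    accepted = sum(
        1
        for h, r, t in triples
        if distmult_score(entity_emb[h], relation_emb[r], entity_emb[t]) > threshold
    )
    return accepted / len(triples)


# Hypothetical 2-dimensional embeddings after training on the newer snapshot.
entity_emb = {
    "drugA": [1.0, 0.0],
    "diseaseX": [1.0, 0.0],
    "diseaseY": [0.5, 1.0],
    "geneG": [0.0, 1.0],
}
relation_emb = {"treats": [1.0, 0.0], "targets": [0.0, 1.0]}

# Old facts that are still valid versus old facts that were removed.
persistent = [("drugA", "treats", "diseaseX"), ("geneG", "targets", "diseaseY")]
deprecated = [("drugA", "treats", "diseaseY")]

retention = acceptance_rate(persistent, entity_emb, relation_emb)
stale_acceptance = acceptance_rate(deprecated, entity_emb, relation_emb)
pooled = acceptance_rate(persistent + deprecated, entity_emb, relation_emb)

print(f"retention={retention:.2f} stale={stale_acceptance:.2f} pooled={pooled:.2f}")
```

In this toy setup the pooled rate over all old facts looks perfect, yet the deprecated triple is still accepted. Reporting the two rates separately is one way to expose that the model has retained valid knowledge without forgetting outdated knowledge.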
The benchmark's multimodal node features and entity-type-grouped tasks enable more nuanced evaluation than generic KG benchmarks. However, the finding that a recent continual knowledge graph embedding framework (IncDE) failed to scale even with 350GB RAM indicates that practical deployment challenges remain significant. The release of data, code, and stratified splits enables the broader research community to develop more robust continual learning approaches for biomedical applications, where knowledge updates occur frequently and outdated information must be systematically removed.
- PrimeKG-CL provides the first realistic continual graph learning benchmark with genuine temporal evolution from biomedical databases rather than synthetic splits.
- No single continual learning strategy performs best across all knowledge graph embedding decoders, indicating decoder-strategy interactions are critical design considerations.
- Standard metrics conflate retention of valid facts with failure to forget deprecated knowledge, with this distinction only appearing clearly in certain decoder architectures.
- Multimodal node features improve entity-level tasks by up to 60%, suggesting the importance of enriched representations in biomedical applications.
- Current continual knowledge graph embedding frameworks face severe scalability challenges, failing to handle mid-scale tasks even with 350GB memory allocation.