The Compressive Knowledge Graph Hypothesis: Which Graph Facts Matter for Scientific Hypothesis Generation?
Researchers evaluated how knowledge graphs (KGs) influence scientific hypothesis generation across several large language models, finding that compact subgraphs often perform comparably to full graphs. The study reveals that KG utility is selective and model-dependent, with useful signal often recoverable from structured, compressed subsets rather than complete local graphs.
This research addresses a fundamental question in AI systems: how much structured information actually matters when language models generate scientific hypotheses? The team tested battery materials hypothesis generation across three models (Mistral-7B, Llama-3.1-70B, and Gemini 2.5 Flash), systematically removing or modifying knowledge graph components and measuring the effect on the generated hypotheses.
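The ablation idea described above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the triple format, the `drop_relation` and `build_prompt` helpers, the prompt wording, and the toy facts are all assumptions made up for the example.

```python
# Toy triples (head, relation, tail); invented for illustration only.
toy_graph = [
    ("material_A", "has_property", "high_capacity"),
    ("material_A", "used_in", "cathode"),
    ("additive_B", "improves", "cycle_life"),
]

def drop_relation(triples, relation):
    """Ablation: remove every triple that uses the given relation."""
    return [t for t in triples if t[1] != relation]

def build_prompt(question, triples):
    """Serialize the (possibly ablated) graph into a plain-text prompt."""
    task = f"Question: {question}\nPropose one testable hypothesis."
    if not triples:
        return task  # no-KG baseline: the model sees only the question
    facts = "\n".join(f"- {h} {r} {t}" for h, r, t in triples)
    return f"Known facts:\n{facts}\n\n{task}"

question = "How could cycle life be improved?"

# One prompt per experimental condition; each would be sent to the same model.
conditions = {
    "full_graph": build_prompt(question, toy_graph),
    "no_improves": build_prompt(question, drop_relation(toy_graph, "improves")),
    "no_graph": build_prompt(question, []),
}
```

Comparing the model's outputs across these conditions is what lets one attribute a change in the hypotheses to a specific part of the graph.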
The findings challenge assumptions about knowledge graph necessity. While KGs do influence outputs, models recover a significant share of the content from internal priors alone. More surprisingly, compact top-k subgraphs frequently matched full-graph performance, suggesting substantial redundancy in the structured knowledge supplied to the model. This match held even when outcome-critical facts were excluded, indicating that models may weight certain semantic patterns more heavily than others.
The redundancy isn't tied to a specific ranking method: random and topology-based subsets recovered signal comparable to that of semantic rankings. This has immediate implications for AI infrastructure, since organizations building KG-augmented systems could potentially reduce computational and storage costs without sacrificing output quality.
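To make the three kinds of subset concrete, here is a minimal sketch of semantic top-k, random, and topology-based selection over a list of triples. The function names, the use of node degree as the topology signal, the relevance scores, and the toy triples are all assumptions for illustration; the study's actual ranking rules may differ.

```python
import random
from collections import Counter

# Toy triples (head, relation, tail) and made-up relevance scores.
sample_triples = [
    ("material_A", "has_property", "high_stability"),
    ("material_A", "used_in", "cathode"),
    ("material_C", "used_in", "anode"),
    ("cathode", "part_of", "cell"),
]
sample_scores = [0.1, 0.9, 0.4, 0.7]  # e.g. similarity to the query (assumed)

def top_k_by_score(triples, scores, k):
    """Semantic ranking: keep the k triples with the highest score."""
    ranked = sorted(zip(triples, scores), key=lambda pair: pair[1], reverse=True)
    return [triple for triple, _ in ranked[:k]]

def random_k(triples, k, seed=0):
    """Random baseline: sample k triples uniformly without replacement."""
    rng = random.Random(seed)
    return rng.sample(triples, min(k, len(triples)))

def top_k_by_degree(triples, k):
    """Topology ranking: prefer triples whose endpoints are well connected."""
    degree = Counter()
    for head, _, tail in triples:
        degree[head] += 1
        degree[tail] += 1
    # sorted() is stable, so ties keep their original order.
    return sorted(triples, key=lambda t: degree[t[0]] + degree[t[2]], reverse=True)[:k]

semantic_subset = top_k_by_score(sample_triples, sample_scores, k=2)
random_subset = random_k(sample_triples, k=2)
topology_subset = top_k_by_degree(sample_triples, k=2)
```

Each subset would replace the full graph in the prompt. Neither the random nor the degree-based selector needs an embedding model or a relevance scorer, which is why comparable results from them matter for cost.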
For the broader AI field, this suggests that efficiency gains in structured knowledge retrieval are achievable without sophisticated ranking algorithms. However, the model-dependent nature of the results suggests that no universal compression strategy exists. The work points toward adaptive KG systems that tailor graph density and structure to specific model architectures and use cases, potentially enabling faster inference while maintaining scientific reasoning quality.
- Compact knowledge graph subsets often match full-graph performance in scientific hypothesis generation across multiple LLM architectures.
- KG utility is selective and model-dependent, with language models recovering substantial structured content from internal priors.
- Useful graph signal is not unique to semantic ranking rules; random and topology-based subsets achieve comparable results.
- The structured knowledge supplied to models is substantially redundant, suggesting potential efficiency improvements in KG-augmented systems.
- No universal compression strategy exists, indicating a need for adaptive approaches tailored to specific model architectures.