AI | Neutral | arXiv – CS AI · 7h ago | 7/10
NanoKnow: How to Know What Your Language Model Knows
Researchers release NanoKnow, a benchmark dataset that reveals how large language models acquire and encode knowledge by leveraging nanochat's fully transparent pre-training data. The study demonstrates that LLM accuracy depends heavily on answer frequency in training data, and that parametric knowledge and external evidence serve complementary roles in model outputs.