AI · Bullish · arXiv – CS AI · 3d ago · 7/10
The Missing Memory Hierarchy: Demand Paging for LLM Context Windows
Researchers developed Pichay, a demand paging system that treats the LLM context window like computer memory managed through a cache hierarchy. In production, it reduces context consumption by up to 93% by evicting stale content, addressing a fundamental scalability limit of LLM-based systems.
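
To make the idea concrete, here is a minimal sketch of demand paging applied to a context window. It is not Pichay's implementation: the `ContextPager` class, its method names, the least-recently-used eviction policy, and the caller-supplied token counts are all assumptions chosen for illustration. Blocks that no longer fit the token budget are moved to a backing store, and are paged back in only when requested again.

```python
from collections import OrderedDict


class ContextPager:
    """Toy demand pager (hypothetical, not Pichay's API).

    Keeps recently used blocks resident within a token budget and
    moves the least recently used ones to a backing store.
    """

    def __init__(self, budget_tokens):
        self.budget_tokens = budget_tokens
        self.resident = OrderedDict()  # block_id -> (text, tokens), oldest first
        self.backing_store = {}        # evicted blocks, held outside the context
        self.faults = 0                # number of page-ins after an eviction

    def used_tokens(self):
        return sum(tokens for _, tokens in self.resident.values())

    def put(self, block_id, text, tokens):
        """Add or refresh a block, then evict stale blocks to fit the budget."""
        self.resident[block_id] = (text, tokens)
        self.resident.move_to_end(block_id)  # mark as most recently used
        # Evict from the oldest end; the block just added is never evicted.
        while self.used_tokens() > self.budget_tokens and len(self.resident) > 1:
            victim_id, victim = self.resident.popitem(last=False)
            self.backing_store[victim_id] = victim

    def get(self, block_id):
        """Return a block's text, paging it back in if it was evicted."""
        if block_id not in self.resident:
            self.faults += 1
            text, tokens = self.backing_store.pop(block_id)
            self.put(block_id, text, tokens)
        self.resident.move_to_end(block_id)
        return self.resident[block_id][0]
```

With a budget of 100 tokens, adding two 60-token blocks evicts the first one; reading it again counts as a fault and pages it back in, which in turn evicts the second block. A real system would need a tokenizer to measure blocks and a policy for what counts as stale; both are left out of this sketch.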