y0news
#task-parallel
1 article
AI · Bullish · arXiv – CS AI · 10h ago · 7/10
🧠

Justitia: Fair and Efficient Scheduling of Task-parallel LLM Agents with Selective Pampering

Justitia is a new scheduling system for task-parallel LLM agents that optimizes GPU server performance through selective resource allocation based on completion order prediction. The system uses memory-centric cost quantification and virtual-time fair queuing to achieve both efficiency and fairness in LLM serving environments.

🏢 Meta