AI · Bullish · arXiv – CS AI · 14h ago · 7/10
GPIC: A Giant Permissive Image Corpus for Visual Generation
Stanford researchers have released GPIC, a massive image dataset containing 28 trillion pixels across 100M training examples with permissive licensing for both research and commercial use. The dataset addresses a critical bottleneck in visual generative modeling by providing a large, safety-filtered, deduplicated corpus hosted on Hugging Face with accompanying benchmarks and baseline models.
🏢 Hugging Face