How Hyper-Datafication Impacts Sustainability Costs in Frontier AI
A study of 550,000 datasets from Hugging Face finds that the AI industry's rapid scaling of data collection, termed 'hyper-datafication', disproportionately shifts environmental, labor, and social costs onto the Global South and precarious workers. The research identifies critical sustainability challenges in frontier AI development and proposes the Data PROOFS framework to address representational harms, carbon emissions, and labor exploitation.