🧠 AI | Neutral | Importance 7/10

The Impact of AI-Generated Text on the Internet

arXiv – CS AI | Jonas Dolezal, Sawood Alam, Mark Graham, Maty Bohacek
🤖 AI Summary

A study using Internet Archive data estimates that by mid-2025 approximately 35% of newly published websites contained AI-generated or AI-assisted text, up from zero before ChatGPT's launch in late 2022. The research finds statistical support for concerns about reduced semantic diversity and increased positive sentiment bias, but no statistical evidence for the feared declines in factual accuracy and stylistic diversity, highlighting a gap between perceived and measured impacts of AI-generated content.

Analysis

The rapid integration of AI text generation into internet content creation marks a turning point for digital information ecosystems. This study provides empirical data to ground what has largely been speculative discussion about AI's impact on online content, finding that roughly 35% of new websites now contain AI-generated or AI-assisted text, a shift that took under three years. By drawing on Internet Archive snapshots from 2022 to 2025, the research establishes a baseline for tracking this transformation.

The divergence between measured outcomes and public perception shows how AI adoption shapes both content and the discourse around it. The finding that semantic diversity declines while positive sentiment increases suggests that AI-assisted text may be converging toward similar linguistic patterns and favorable framings. However, the absence of statistical evidence for degraded factual accuracy or stylistic diversity runs against prevailing narratives about the quality of AI-generated content. This gap likely reflects both the actual capabilities of current AI systems and selection bias: organizations using AI text generation may do so for routine content where accuracy impacts are minimal.

The study also reports a pattern in perception: skepticism about AI correlates strongly with belief in negative impacts, while frequent AI users express more measured concerns. This suggests that direct experience with AI tools tempers the most pessimistic views. For technology platforms, content creators, and information consumers, these findings indicate that the primary documented risk of AI-generated content is semantic homogenization rather than factual degradation. Publishers and platforms should focus mitigation efforts on maintaining content diversity rather than on wholesale restrictions on AI assistance, given that AI-generated or AI-assisted text now appears in a substantial share of new internet content.

Key Takeaways
  • By mid-2025, approximately 35% of newly published websites contained AI-generated or AI-assisted text, up from zero before ChatGPT's launch in late 2022.
  • AI-generated text correlates with reduced semantic diversity and increased positive sentiment bias, but not with decreased factual accuracy.
  • Public perception overestimates some negative impacts of AI-generated content, notably on factual accuracy and stylistic diversity, relative to measured outcomes.
  • Frequent AI users hold more moderate views about AI's negative impacts than infrequent users or AI skeptics.
  • Content homogenization poses a greater documented risk than factual inaccuracy in AI-generated internet content.