When New Generators Arrive: Lifelong Machine-Generated Text Attribution via Ridge Feature Transfer
Researchers propose RidgeFT, a machine learning framework that attributes machine-generated text to its source generator and keeps learning as new generators appear, while preserving performance on previously learned ones. The method uses efficient closed-form updates over a stable feature space to balance adaptation to new language models with retention of old ones.
The emergence of increasingly sophisticated large language models presents a critical challenge for content authentication systems. As new generators continuously enter the market, attribution models face a fundamental tension: adapting to identify novel sources without degrading their ability to recognize previously seen ones. The research directly addresses this 'lifelong learning' problem in machine-generated text attribution, a field of growing importance for combating misinformation, protecting intellectual property, and maintaining accountability for AI-generated content.
RidgeFT distinguishes itself through an analytic approach that avoids storing exemplar data from previous generators, a memory-intensive practice that scales poorly as model diversity increases. Instead, the framework maintains compact class-wise statistics and employs closed-form ridge regression for efficient updates when new generators appear. By freezing the initial encoder and performing covariance calibration, RidgeFT suppresses irrelevant variations while preserving discriminative features.
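To make the idea of exemplar-free, closed-form updates concrete, the following is a minimal sketch of a generic incremental ridge classifier, not RidgeFT's actual implementation. It assumes features already come from a frozen encoder, and it keeps only two running statistics: a Gram matrix and per-class feature sums. The class name `IncrementalRidgeAttributor`, the parameter `lam`, and the convention that new generator ids continue from the old ones are all illustrative assumptions; the covariance calibration step described above is not reproduced here.

```python
import numpy as np


class IncrementalRidgeAttributor:
    """Hypothetical sketch: ridge classifier over frozen-encoder features.

    Stores only a (dim x dim) Gram matrix and a (dim x n_classes) matrix of
    per-class feature sums, never the raw texts or features themselves.
    """

    def __init__(self, dim, lam=1.0):
        self.dim = dim
        self.lam = lam                        # ridge penalty (assumed value)
        self.gram = np.zeros((dim, dim))      # running sum of X^T X
        self.class_sums = np.zeros((dim, 0))  # column k: summed features of class k
        self.weights = np.zeros((dim, 0))

    def add_generators(self, feats, labels):
        """Fold in a batch of features (n, dim) with integer labels (n,).

        Labels for new generators are assumed to continue the old numbering.
        """
        labels = np.asarray(labels, dtype=int)
        n_classes = max(self.class_sums.shape[1], int(labels.max()) + 1)
        pad = n_classes - self.class_sums.shape[1]
        # Grow the statistics matrix with zero columns for unseen generators.
        self.class_sums = np.hstack([self.class_sums, np.zeros((self.dim, pad))])
        onehot = np.eye(n_classes)[labels]
        self.gram += feats.T @ feats
        self.class_sums += feats.T @ onehot
        # Closed-form ridge solution: W = (X^T X + lam * I)^{-1} X^T Y.
        reg = self.gram + self.lam * np.eye(self.dim)
        self.weights = np.linalg.solve(reg, self.class_sums)

    def predict(self, feats):
        """Return the index of the highest-scoring generator per row."""
        return (feats @ self.weights).argmax(axis=1)
```

Because both statistics are plain sums over samples, the weights obtained after several incremental calls match those of a single ridge fit on all data seen so far. In this simplified setting, that is why adding a generator does not overwrite what was learned about earlier ones.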
For the AI and content moderation industry, this work carries significant implications. Platforms that host user-generated content increasingly need robust detection of AI-assisted or fully synthetic text. A scalable attribution system enables more effective detection of manipulated content and potential misuse across evolving model ecosystems. The reported macro-F1 gains across multiple evaluation scenarios suggest the approach is practically viable.
The research highlights that efficient, analytic (closed-form) approaches may outperform conventional training paradigms for this specific problem space. As attribution becomes commoditized across major platforms, such innovations directly influence detection accuracy and the resources required for content moderation at scale.
- RidgeFT enables continuous identification of new language models without degrading performance on previously learned generators.
- The framework uses closed-form analytics rather than exemplar replay, reducing memory requirements and computational overhead.
- Consistent improvements in macro-F1 scores across domains suggest practical applicability for content authentication systems.
- Feature-stable updates through covariance calibration provide a novel approach to the catastrophic forgetting problem in lifelong learning.
- The method addresses growing needs for AI accountability and misuse detection as new language models rapidly proliferate.