Where Rectified Flows Leak: Characterising Membership Signals Along the Interpolation Path
Researchers demonstrate that Rectified Flows, a generative model architecture increasingly deployed in production systems, leak membership information about training data along their interpolation path in a quantifiable, bell-shaped pattern. This vulnerability enables practical membership inference attacks that can distinguish training set members from non-members, raising significant privacy and copyright concerns for deployed generative AI systems.
The study characterizes a previously undocumented privacy vulnerability in Rectified Flows. Training data leaves exploitable traces along the model's interpolation path, the mathematical trajectory from random noise to generated outputs. These traces are not visible in the model's final outputs, yet they encode membership signals. Plotted along the interpolation path, the signal follows a bell-shaped curve whose peak location the researchers can predict in closed form under Gaussian assumptions.
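To make the idea of an interpolation-path signal concrete, the sketch below computes a per-sample error curve along the path. It uses the standard Rectified Flow setup, in which a point on the path is x_t = (1 - t) * x_0 + t * x_1 (noise x_0, data x_1) and the regression target is the straight-line velocity x_1 - x_0. Using this error as the membership score is an assumption made here for illustration and may differ from the statistic used in the study. The names `path_errors`, `velocity_fn`, and `toy_velocity` are hypothetical, and the toy velocity function stands in for a trained model.

```python
import numpy as np

def path_errors(x1, velocity_fn, ts, n_noise=8, seed=0):
    """Mean squared velocity-matching error of one sample at each t in ts.

    x1          : candidate sample (data endpoint of the path)
    velocity_fn : callable (x_t, t) -> predicted velocity, same shape as x_t
    ts          : 1-D array of interpolation times in (0, 1)
    n_noise     : number of noise endpoints to average over
    """
    rng = np.random.default_rng(seed)
    x1 = np.asarray(x1, dtype=float)
    errors = np.zeros(len(ts))
    for _ in range(n_noise):
        x0 = rng.standard_normal(x1.shape)   # noise endpoint of the path
        target = x1 - x0                     # straight-line velocity target
        for i, t in enumerate(ts):
            xt = (1.0 - t) * x0 + t * x1     # point on the interpolation path
            pred = velocity_fn(xt, t)
            errors[i] += np.mean((pred - target) ** 2)
    return errors / n_noise

# Toy stand-in for a trained model (assumption): always predicts zero velocity.
toy_velocity = lambda xt, t: np.zeros_like(xt)

ts = np.linspace(0.05, 0.95, 10)
curve = path_errors(np.ones(4), toy_velocity, ts)
```

With a real model, comparing such curves between known training members and held-out samples at each t is one way to look for the bell-shaped gap described above. The toy function here produces a flat curve because its prediction ignores t.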
The findings emerge from growing scrutiny of how generative models retain training data information. While previous research focused on verbatim reproduction and obvious memorization, this work demonstrates subtler encoding mechanisms that evade standard monitoring. The universality of the bell-shaped signal across different modalities (audio and images) suggests the vulnerability is fundamental to how Rectified Flows process information rather than an implementation artifact.
For deployed systems, this creates meaningful risk. Organizations using Rectified Flows for sensitive applications are exposed to membership inference attacks, in which an adversary determines whether a specific sample was part of the training set, with implications for copyright liability and privacy violations. Because the location of the maximum signal can be predicted in closed form, attackers can concentrate their computational effort on the most informative part of the interpolation path. The risk is greatest for applications trained on proprietary data or personal information.
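This summary does not reproduce the closed-form peak location, so the sketch below shows the brute-force alternative that such a formula would make unnecessary: sweep a grid of interpolation times, measure how well a per-sample score separates known members from non-members at each one, and keep the best. It assumes that a lower score indicates membership, as is common for loss-based attacks; that direction, the function names, and the tiny hand-written score arrays are all illustrative assumptions, not results from the study.

```python
import numpy as np

def auc_lower_is_member(member, nonmember):
    """AUC of the rule 'lower score => member', via pairwise comparison."""
    m = np.asarray(member, dtype=float)[:, None]
    n = np.asarray(nonmember, dtype=float)[None, :]
    wins = (m < n).mean()      # member scored lower than non-member
    ties = (m == n).mean()     # ties count as half a win
    return wins + 0.5 * ties

def auc_per_timestep(member_curves, nonmember_curves):
    """Rows are samples, columns are interpolation times; returns one AUC per column."""
    member_curves = np.asarray(member_curves, dtype=float)
    nonmember_curves = np.asarray(nonmember_curves, dtype=float)
    return np.array([
        auc_lower_is_member(member_curves[:, i], nonmember_curves[:, i])
        for i in range(member_curves.shape[1])
    ])

# Hand-written placeholder scores (not measured data): 2 samples x 3 timesteps.
ts = np.array([0.2, 0.5, 0.8])
member_curves = [[1.0, 1.0, 1.0], [2.0, 0.0, 2.0]]
nonmember_curves = [[1.0, 3.0, 1.0], [2.0, 4.0, 2.0]]

aucs = auc_per_timestep(member_curves, nonmember_curves)
best_t = ts[np.argmax(aucs)]   # timestep where the two groups separate best
```

A sweep like this costs one model evaluation per sample per timestep, which is the expense a closed-form prediction of the peak would let an attacker avoid.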
The broader implications extend beyond individual model security. As generative models become infrastructure for numerous industries, understanding these interpolation-path vulnerabilities becomes critical for compliance with privacy regulations like GDPR. Future work will likely focus on developing defenses, such as differential privacy mechanisms, specifically designed to disrupt the interpolation-path signal while maintaining generation quality. The research establishes a new dimension for evaluating generative model safety and privacy compliance.
- Rectified Flows leak training data membership signals along interpolation paths that follow a universal bell-shaped curve independent of model type.
- The peak location of membership signals can be mathematically predicted in closed form under Gaussian assumptions, enabling targeted attacks.
- Membership inference attacks using this mechanism can distinguish training set members from non-members with exploitable accuracy.
- Privacy vulnerabilities in generative models persist in intermediate representations even when final outputs appear safe and non-memorized.
- The findings apply across multiple modalities (audio, images), suggesting the vulnerability is a fundamental property of Rectified Flow architecture.