FlowTime: Towards Continuous Generative Watch Time Prediction via Flow-based Personalized Priors
FlowTime introduces a novel 'Continuous Generative Regression' paradigm for watch time prediction in short-video recommender systems, addressing limitations of existing regression, ordinal, and discrete generative approaches. The method uses flow-based personalized priors within a one-step generative VAE to model multimodal user-item interaction patterns while reducing inference latency, demonstrating superior performance in both offline experiments and A/B testing.
FlowTime tackles a core challenge for video platforms: predicting how long users will engage with content. The research argues that existing watch time prediction methods fail to capture heterogeneity in user behavior: users with identical interests often exhibit different viewing patterns depending on individual habits and context. This gap matters because watch time prediction directly influences content ranking, user retention, and platform monetization.
The technical contribution centers on reframing watch time prediction as a continuous generative problem rather than discrete classification or simple regression. By using flow-based neural networks to build personalized priors conditioned on user history, FlowTime avoids the computational overhead of iterative denoising while maintaining expressivity. The one-step generative VAE design specifically targets the latency constraints that limit existing discrete generative methods, which is critical for real-time recommendation scenarios serving millions of users.
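To make the idea of a personalized prior feeding a one-step decoder more concrete, here is a minimal sketch in plain Python. It is not FlowTime's architecture: the conditioner `personalized_prior_params`, the decoder `decode`, and the choice of a single affine flow layer are all assumptions made for illustration, and hand-written statistics stand in for learned networks. A single affine layer only shifts and rescales Gaussian noise, so it cannot represent the multimodal patterns described above; a real flow would stack learned nonlinear invertible layers.

```python
import math
import random


def personalized_prior_params(user_history):
    """Hypothetical conditioner: map a non-empty list of past watch times
    (seconds) to the shift and log-scale of one affine flow layer.
    A real system would use a learned network here."""
    logs = [math.log1p(t) for t in user_history]
    mean = sum(logs) / len(logs)
    var = sum((x - mean) ** 2 for x in logs) / len(logs)
    return mean, 0.5 * math.log(var + 1e-6)


def sample_prior(user_history, rng):
    """One affine flow step: z = shift + exp(log_scale) * eps, eps ~ N(0, 1)."""
    shift, log_scale = personalized_prior_params(user_history)
    eps = rng.gauss(0.0, 1.0)
    return shift + math.exp(log_scale) * eps


def decode(z, video_duration):
    """Hypothetical decoder: map a latent to a continuous watch time in a
    single pass, clipped to [0, video_duration]."""
    return min(max(math.expm1(z), 0.0), video_duration)


def predict_watch_time(user_history, video_duration, n_samples=256, seed=0):
    """Draw latents from the user's prior and decode each one exactly once
    (no iterative refinement). Returns the sample mean and the samples."""
    rng = random.Random(seed)
    samples = [
        decode(sample_prior(user_history, rng), video_duration)
        for _ in range(n_samples)
    ]
    return sum(samples) / len(samples), samples


# Two users watching the same 60-second video.
skimmer_mean, skimmer_samples = predict_watch_time([2.0, 3.0, 4.0], 60.0)
binger_mean, binger_samples = predict_watch_time([40.0, 50.0, 55.0], 60.0)
```

Because the prior depends on each user's history, the same video yields different predictive distributions for the two users, and each sample costs one pass through the flow and one pass through the decoder.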
For the recommendation system industry, this work establishes new benchmarking standards through TimeRec, an open-source library addressing a historically fragmented evaluation landscape. The approach's demonstrated superiority in A/B testing on actual platforms suggests practical viability rather than theoretical merit alone. This methodological shift could influence how major platforms (YouTube, TikTok, Instagram Reels) approach engagement prediction, potentially improving content discovery accuracy and user experience quality.
Future development hinges on whether the flow-based personalization mechanism generalizes across diverse user populations and content categories. Integration with existing recommendation architectures and computational efficiency at scale remain open questions for broader adoption.
- FlowTime proposes a fourth paradigm (Continuous Generative Regression) for watch time prediction, addressing mean-collapse and quantization limitations of prior approaches.
- Flow-based personalized priors enable adaptive modeling of multimodal user-item interaction patterns conditioned on individual user history.
- One-step generative VAE design eliminates iterative denoising latency while maintaining continuous latent space expressivity for real-time systems.
- TimeRec library establishes the first standardized open-source benchmarking framework for watch time prediction with novel personalization metrics.
- A/B testing validates FlowTime's superior performance on production platforms, signaling practical viability beyond academic benchmarks.
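
The mean-collapse and quantization limitations mentioned in the first takeaway can be illustrated with a toy example. The watch times and the 10-second bucket width below are synthetic values chosen for illustration, not data or settings from the paper. The sketch relies on two standard facts: the constant that minimizes squared error is the mean, and bucketing a continuous value discards its position within the bucket.

```python
import statistics

# Synthetic bimodal watch times (seconds): quick skips and near-complete views.
watch_times = [2.0, 3.0, 4.0, 3.0, 55.0, 58.0, 57.0, 56.0]

# Mean collapse: the best constant prediction under squared error is the mean,
# which lands between the two modes where no actual viewing occurs.
mse_optimal = statistics.fmean(watch_times)


def nearest_gap(value, data):
    """Distance from `value` to the closest observed data point."""
    return min(abs(value - x) for x in data)


# Quantization: a bucketed model can only output bucket centers.
bucket_width = 10.0


def quantize(t):
    """Map a watch time to the center of its fixed-width bucket."""
    return (t // bucket_width) * bucket_width + bucket_width / 2


quantized = [quantize(t) for t in watch_times]
max_quant_error = max(abs(t - q) for t, q in zip(watch_times, quantized))
```

In this toy data the squared-error optimum sits more than 25 seconds from every observed watch time, while bucketing loses up to half a bucket width of precision. A continuous generative model is meant to sidestep both issues by producing real-valued samples from a distribution that can place mass on each mode.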