Ocean4Rec: Offline LLM-Derived OCEAN Profiles for Request-Time VOD Reranking
Ocean4Rec is a video-on-demand (VOD) recommendation approach that uses LLMs offline to generate OCEAN personality profiles for content items, then reranks candidates at request time without any real-time model calls. In offline evaluation on Samsung Smart TV data, it improves NDCG by 7.6-61.5%, depending on the base recommender, while keeping deployment simple and latency predictable for production services.
Ocean4Rec addresses a fundamental tension in deploying LLM-powered recommenders at scale: using LLMs for richer content understanding while avoiding the operational complexity and latency overhead of request-time inference. The research demonstrates that personality-based content profiling can be effectively decoupled into an offline batch phase and a lightweight online reranking phase, preserving the deployment advantages of traditional systems.
The approach reflects a broader trend in production ML systems, where practitioners try to balance model sophistication against operational simplicity. Rather than calling an LLM for every user request, Ocean4Rec precomputes a profile for each item in a five-dimensional OCEAN space (the Big Five personality traits: Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) and represents each user in the same space by aggregating their interactions with time decay. This architectural choice avoids the throughput planning, unpredictable tail latency, and capacity contention that come with request-time inference.
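The request-time step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the exponential half-life decay, the cosine similarity, the linear blend weight `alpha`, and all function names are hypothetical choices, since the summary only says that user preferences are aggregated with time decay in the same five-dimensional space.

```python
import math

def user_profile(interactions, item_profiles, now, half_life_days=30.0):
    """Time-decayed weighted mean of the OCEAN vectors of watched items.

    interactions: iterable of (item_id, unix_timestamp).
    item_profiles: dict of item_id -> 5 floats, precomputed offline.
    The half-life decay is an assumption, not taken from the paper.
    """
    total = [0.0] * 5
    weight_sum = 0.0
    for item_id, ts in interactions:
        vec = item_profiles.get(item_id)
        if vec is None:
            continue  # item has no offline profile yet
        age_days = max(0.0, (now - ts) / 86400.0)
        weight = 0.5 ** (age_days / half_life_days)
        for i in range(5):
            total[i] += weight * vec[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return None
    return [t / weight_sum for t in total]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rerank(candidates, user_vec, item_profiles, alpha=0.5):
    """Reorder (item_id, base_score) pairs by a blend of base score and
    profile similarity. The linear blend is a hypothetical scoring rule."""
    def score(candidate):
        item_id, base = candidate
        vec = item_profiles.get(item_id)
        sim = cosine(user_vec, vec) if user_vec and vec else 0.0
        return (1.0 - alpha) * base + alpha * sim
    return [item for item, _ in sorted(candidates, key=score, reverse=True)]
```

Everything in this sketch is a dictionary lookup plus arithmetic on five-element vectors, which is why the online path needs no model call: the LLM's contribution is already stored in `item_profiles`.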
The offline evaluation results on Samsung Smart TV data show meaningful improvements, particularly for the LightGCN generator (61.5% NDCG gain, 67.3% HR gain), though the authors appropriately caveat that exact-item replay metrics may underestimate real-world impact given recency's strength as an industrial baseline. The results validate that compact semantic profiles derived from LLMs can meaningfully improve ranking when integrated with simpler online mechanisms.
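For readers unfamiliar with the metrics, the sketch below shows how HR@k and NDCG@k are commonly computed in an exact-item replay setup with a single held-out item per user. It assumes one relevant item per test case and assumes the reported gains are relative improvements over the base recommender; neither detail is confirmed by this summary, and the function names are illustrative.

```python
import math

def hit_rate_at_k(ranked, target, k=20):
    """1.0 if the held-out item appears in the top k, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k=20):
    """NDCG with one relevant item: 1 / log2(rank + 1), rank is 1-based.
    The ideal DCG is 1.0 in this case, so no extra normalization is needed."""
    for pos, item in enumerate(ranked[:k]):
        if item == target:
            return 1.0 / math.log2(pos + 2)
    return 0.0

def relative_gain_pct(baseline, treated):
    """Relative improvement in percent (assumed reporting convention)."""
    return 100.0 * (treated - baseline) / baseline
```

Because both metrics only reward retrieving the exact held-out item, a reranker that surfaces a different but equally suitable title scores zero for that case, which is one reason replay metrics can understate real-world impact.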
Looking forward, the core insight—that LLM value can be materialized offline for latency-sensitive services—has applications beyond VOD, including e-commerce and feed ranking. The research suggests a scalable path for LLM integration in high-volume systems without sacrificing operational predictability.
- Ocean4Rec decouples LLM usage into offline profiling and online reranking, eliminating request-time inference overhead
- On Samsung Smart TV data, OCEAN personality profiles improve NDCG@20 by 7.6-61.5%, depending on the base recommender architecture
- The architecture preserves the deployment simplicity and latency predictability that production video services depend on
- Time-decayed aggregation of user preferences in the same five-dimensional space enables lightweight request-time ranking without model calls
- The results suggest that offline LLM-derived features are a practical way to bring richer content understanding into latency-sensitive recommendation systems