Mind the Gap: Bridging Behavioral Silos with LLMs in Multi-Vertical Recommendations
Researchers propose a framework that uses Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to address the cold-start problem on multi-vertical e-commerce platforms. It transfers behavioral knowledge from data-rich verticals such as restaurants to emerging categories such as grocery and retail. The approach synthesizes hierarchical taxonomic features from user order histories and integrates them into a Multi-Task Learning (MTL) ranking model, with improved personalization reported in production environments.
This research addresses a fundamental challenge in modern e-commerce platforms: enabling personalization for newer product categories that lack sufficient historical user data. DoorDash and similar multi-vertical marketplaces struggle with sparse behavioral signals in emerging verticals, limiting recommendation quality and user engagement. The team's solution leverages LLMs as a knowledge-transfer mechanism, using generative inference to synthesize high-dimensional user preference features from data-rich verticals where substantial ordering history exists.
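To make the knowledge-transfer idea concrete, the following is a minimal sketch of a retrieval step that might precede the LLM call. It is not the authors' implementation. The taxonomy entries, the token-overlap scoring, and the function names `retrieve_candidates` and `build_prompt` are all assumptions made for illustration; a production system would more plausibly use embedding-based retrieval and a real LLM client, neither of which is shown here.

```python
from collections import Counter

# Hypothetical target-vertical taxonomy (illustrative only, not from the paper).
GROCERY_TAXONOMY = [
    "Grocery > Pantry > Pasta and Noodles",
    "Grocery > Pantry > Asian Sauces",
    "Grocery > Produce > Fresh Vegetables",
    "Grocery > Frozen > Frozen Pizza",
]


def tokenize(text):
    """Lowercase, drop the '>' separators, and keep tokens longer than two characters."""
    return [t for t in text.lower().replace(">", " ").split() if len(t) > 2]


def retrieve_candidates(order_history, taxonomy, k=2):
    """Rank taxonomy paths by token overlap with the user's order history.

    Token overlap stands in for the retrieval component of RAG; it is an
    assumption of this sketch, not the method described in the source.
    """
    history_tokens = Counter(t for item in order_history for t in tokenize(item))
    scored = []
    for path in taxonomy:
        score = sum(history_tokens[t] for t in set(tokenize(path)))
        scored.append((score, path))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [path for score, path in scored[:k] if score > 0]


def build_prompt(order_history, candidates):
    """Assemble the text that would be sent to an LLM (the call itself is omitted)."""
    history = "\n".join(f"- {item}" for item in order_history)
    options = "\n".join(f"- {path}" for path in candidates)
    return (
        "Past restaurant orders:\n" + history + "\n\n"
        "Candidate grocery categories:\n" + options + "\n\n"
        "Select the categories this user is likely to prefer."
    )


orders = ["Pad Thai noodles", "Pepperoni pizza", "Pizza margherita"]
candidates = retrieve_candidates(orders, GROCERY_TAXONOMY)
prompt = build_prompt(orders, candidates)
```

Restricting the prompt to retrieved candidates keeps the LLM's output inside a known taxonomy, which is one plausible reason to pair retrieval with generation in this setting.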
The framework's innovation lies in its hierarchical RAG pipeline, which extracts multi-level taxonomic features encoding both long-term cross-vertical affinities and short-term purchase intent. This represents a meaningful evolution in recommendation systems, moving beyond traditional collaborative filtering to incorporate semantic understanding of user behavior patterns across business silos.
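As a rough illustration of what "multi-level" features with long-term and short-term views could look like, the sketch below rolls leaf-level category tags up to every ancestor level and optionally applies recency decay. The input format, the `hierarchical_affinities` function, and the exponential half-life weighting are assumptions for this example; the source does not specify how the features are computed.

```python
from collections import defaultdict


def hierarchical_affinities(tagged_orders, now_day, half_life_days=None):
    """Aggregate category tags into normalized scores at every taxonomy level.

    tagged_orders: list of (day, path) pairs, where path is a string such as
        "Grocery > Dairy > Yogurt" (assumed here to be LLM-generated tags).
    half_life_days: None weights all orders equally (long-term view);
        a number applies exponential recency decay (short-term view).
    """
    scores = defaultdict(float)
    total_weight = 0.0
    for day, path in tagged_orders:
        age = now_day - day
        weight = 1.0 if half_life_days is None else 0.5 ** (age / half_life_days)
        total_weight += weight
        levels = [part.strip() for part in path.split(">")]
        # Credit the leaf and each of its ancestors.
        for depth in range(1, len(levels) + 1):
            scores[" > ".join(levels[:depth])] += weight
    if total_weight == 0:
        return {}
    return {node: value / total_weight for node, value in scores.items()}


tagged = [
    (0, "Grocery > Dairy > Yogurt"),
    (90, "Grocery > Dairy > Milk"),
    (100, "Grocery > Snacks > Chips"),
]
long_term = hierarchical_affinities(tagged, now_day=100)
short_term = hierarchical_affinities(tagged, now_day=100, half_life_days=10)
```

In this toy data the long-term view favors "Grocery > Dairy" (two of three orders), while the short-term view favors "Grocery > Snacks" because the most recent order dominates after decay. Either dictionary could then be passed to a ranking model as sparse features.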
For e-commerce platforms and their investors, this approach has significant operational implications. Bridging the cold-start gap could accelerate monetization of emerging verticals while improving user experience. The production validation through both offline and online evaluation suggests the method delivers measurable business impact, not merely theoretical advancement. If the results generalize, platform operators could onboard and scale new categories faster, capturing market opportunities in grocery and retail segments that previously suffered from weak personalization.
The broader implications extend to other multi-category marketplaces facing similar data distribution challenges. The technique demonstrates how LLMs can serve as data augmentation and knowledge-transfer tools in production systems, creating competitive advantages for platforms capable of implementing such infrastructure. Investors should monitor whether major e-commerce and marketplace operators adopt similar approaches to improve marginal category performance.
- LLMs enable knowledge transfer from data-rich to data-sparse product verticals, solving cold-start problems in multi-category marketplaces.
- Hierarchical RAG pipelines extract multi-level user preference features that encode both long-term affinities and short-term purchase intent.
- Production Multi-Task Learning models integrate generated features to significantly improve personalization and engagement metrics.
- The approach delivers measurable business value in real-world e-commerce environments, validated through online experimentation.
- This framework accelerates monetization of emerging business verticals by enabling faster category scaling with better recommendations.