AI · Bearish · arXiv – CS AI · 3d ago · 7/10
🧠Researchers introduce τ-Rec, a new benchmark for evaluating conversational AI recommender systems that replaces subjective LLM-based judging with verifiable, measurable rewards. Testing across nine model configurations reveals a critical reliability gap, with even top-performing models achieving only ~57% accuracy on single-attempt tasks, exposing significant limitations in current agentic AI deployment.
🧠 GPT-5 · 🧠 Claude · 🧠 Sonnet
AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠FlowTime introduces a novel 'Continuous Generative Regression' paradigm for watch time prediction in short-video recommender systems, addressing limitations of existing regression, ordinal, and discrete generative approaches. The method uses flow-based personalized priors within a one-step generative VAE to model multimodal user-item interaction patterns while reducing inference latency, demonstrating superior performance in both offline experiments and A/B testing.
AI · Bearish · arXiv – CS AI · May 1 · 7/10
🧠Researchers identify four systematic bias channels in transformer-based AI recommenders: positional bias favoring recent events, popularity amplification creating echo chambers, latent driver bias from unobserved user motivations, and synthetic data bias from retraining on AI-generated logs. These mechanism-level risks can distort user exposure and choice at scale, potentially reducing reliability despite strong offline performance metrics.
AI · Neutral · arXiv – CS AI · Mar 26 · 7/10
🧠Researchers challenge the assumption that fair model representations in recommender systems translate to fair recommendations. Their study reveals that while optimizing for fair representations improves recommendation parity, representation-level evaluation is not a reliable proxy for measuring actual fairness in recommendations when comparing models.
🏢 Meta
AI · Neutral · arXiv – CS AI · Mar 5 · 6/10
🧠Researchers introduce SafeCRS, a safety-aware training framework for LLM-based conversational recommender systems that addresses personalized safety vulnerabilities. The system reduces safety violation rates by up to 96.5% while maintaining recommendation quality by respecting individual user constraints like trauma triggers and phobias.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers developed HAP (Heterogeneity-Aware Adaptive Pre-ranking), a new framework for recommender systems that addresses gradient conflicts in training by separating easy and hard samples. The system has been deployed in Toutiao's production environment for 9 months, achieving a 0.4% improvement in user engagement without additional computational cost.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 3
🧠Researchers propose AlphaFree, a novel recommender system that eliminates traditional dependencies on user embeddings, raw IDs, and graph neural networks. The system achieves up to 40% performance improvements while reducing GPU memory usage by up to 69% through language representations and contrastive learning.
AI · Bullish · arXiv – CS AI · 4d ago · 6/10
🧠Researchers have developed MedicalRec, a transformer-based recommender system that identifies optimal deep learning models for medical image classification tasks without requiring retraining. The system leverages a new dataset (MedicalRec-Bench) containing over 5,000 model performance records across five medical imaging domains, achieving a 75.5% HitRate@100 and addressing the computational waste inherent in trial-and-error model selection.
AI · Neutral · arXiv – CS AI · 4d ago · 5/10
🧠Researchers present MO-PQUCB, a novel algorithm for personalized multi-objective decision-making that combines conversational queries with bandit feedback to learn user preferences more efficiently. The method uses a Plackett-Luce choice model and shift-invariant regularization to overcome fundamental learning barriers, demonstrating improved regret scaling and robustness to corrupted preference signals compared to existing approaches.
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers demonstrate a carbon-aware recommendation system for e-commerce that infers missing Product Carbon Footprint data and applies post-hoc re-ranking to balance user engagement against sustainability. The framework achieves substantial carbon reductions with minimal engagement cost across multiple product categories and recommendation models.
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers demonstrate that semantic ID-based generative recommendation systems hit significant scaling bottlenecks, while large language models used directly as recommenders show superior scaling properties and up to 20% performance improvements. This challenges current approaches in generative recommendation and suggests LLM-based systems represent a more promising path forward for recommendation foundation models.
AI · Bullish · arXiv – CS AI · Jun 1 · 6/10
🧠Researchers propose HERec, a hyperbolic-geometry-based recommender system framework that balances content exploration and exploitation while mitigating information cocoons. The system combines semantic-enhanced hierarchical mechanisms with automatic clustering to improve diversity by 11.39% and utility by 5.49% over existing approaches.
AI · Neutral · arXiv – CS AI · May 29 · 6/10
🧠Researchers propose integrating explicit user feedback (comments, reviews, verbal text) into Large Language Model-based recommendation systems to better align with actual user preferences. The approach addresses limitations in traditional recommender systems that rely solely on implicit signals like clicks and purchases, potentially reducing filter bubbles and improving transparency.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers compared two conditioning approaches in educational recommendation systems: context-based (using current student questions) versus memory-based (using persistent learner history). Memory-based conditioning produced more personalized, history-dependent behavior while context-based approaches showed stronger immediate responsiveness, suggesting that embedding-based similarity metrics alone are insufficient for capturing true personalization effects.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Researchers propose novel algorithms (LDB-DF and NDB-DF) for contextual dueling bandits that handle delayed feedback—a critical real-world constraint in recommender systems and LLM alignment. The breakthrough involves an Inverse Probability Weighting mechanism that eliminates bias from delayed observations, achieving theoretical regret bounds of O(d√T) for linear settings.
AI · Neutral · arXiv – CS AI · May 9 · 5/10
🧠Researchers propose UAT-MC, a new defense mechanism for multimodal recommender systems that addresses cross-modal gradient misalignment in evasion-based promotion attacks. The approach synchronizes visual and textual perturbations through coordinated adversarial training, improving robustness while maintaining recommendation quality.
AI · Bullish · arXiv – CS AI · Mar 3 · 4/10 · 3
🧠Researchers propose I-LLMRec, a new method for AI recommender systems that uses images instead of lengthy text descriptions to represent items, reducing computational token usage while maintaining recommendation quality. The approach leverages the information overlap between images and descriptions to create more efficient and robust LLM-based recommendation systems.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 3
🧠Researchers propose Rejuvenated Cross-Entropy for Knowledge Distillation (RCE-KD) to improve knowledge distillation in recommender systems by addressing limitations of Cross-Entropy loss when distilling a teacher model's rankings. The method splits the teacher's top items into subsets and uses adaptive sampling to better align with the loss's theoretical assumptions.
AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 3
🧠PuppetChat is a research prototype messaging system that uses AI-powered recommendations and personalized micronarratives to enhance intimate communication between close partners and friends. A 10-day field study with 11 dyads showed the system improved social presence, self-disclosure, and relationship continuity through more expressive bidirectional interactions.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 5
🧠Researchers conducted interviews with 11 practitioners at major tech companies to study how fairness considerations are integrated into recommender system workflows. The study identified key challenges including defining fairness in RS contexts, balancing stakeholder interests, and facilitating cross-team communication between technical, legal, and fairness teams.