AINeutralarXiv โ CS AI ยท 5h ago
๐ง
Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions
Researchers have introduced RealPref, a new benchmark for evaluating how well Large Language Models follow user preferences in long-term personalized interactions. The study reveals that LLM performance significantly degrades with longer contexts and more implicit preference expressions, highlighting challenges in developing user-aware AI assistants.