y0news
#reward-systems
1 article
AI · Bullish · arXiv – CS AI · 2d ago · 6/10

Aligning Large Language Models with Searcher Preferences

Researchers introduce SearchLLM, the first large language model designed for open-ended generative search. It features a hierarchical reward system that balances safety constraints with user alignment. Deployed on RedNote's AI search platform, the model significantly improved user engagement: Valid Consumption Rate rose by 1.03% and Re-search Rate fell by 2.81%.