AINeutralarXiv โ CS AI ยท Mar 56/10
๐ง
Towards Personalized Deep Research: Benchmarks and Evaluations
Researchers introduce PDR-Bench, the first benchmark for evaluating personalization in Deep Research Agents (DRAs), featuring 250 realistic user-task queries across 10 domains. The benchmark uses a new PQR Evaluation Framework to measure personalization alignment, content quality, and factual reliability in AI research assistants.