AI · Neutral · arXiv – CS AI · 18h ago · 6/10
Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback
Researchers evaluate whether deep research agents (DRAs) can improve iteratively through feedback. They find that self-reflection yields negligible gains, while a single round of process-level feedback produces substantial improvements. These gains do not compound over multiple turns, however, because the agents regress on previously satisfied criteria.