AI · Neutral · arXiv – CS AI · 10h ago · 6/10
When (and How) to Trust the Expert: Diagnosing Query-Time Expert-Guided Reinforcement Learning
Researchers conduct a comprehensive benchmarking study of expert-guided reinforcement learning methods that reveals three critical failure modes missed by single-paper evaluations. They propose a decision rule based on pre-training observables to guide method selection, and they introduce EDGE, a new design point that exposes exploitable architectural dimensions.