AI · Neutral · arXiv – CS AI · 14h ago · 6/10
Quotient DAGs for Off-Policy Evaluation: Forward-Flow Importance Sampling and Exact Slate Propensities
Researchers introduce Quotient DAGs, a framework for off-policy evaluation that targets the variance of importance sampling by recognizing when details of the generation process are irrelevant to the evaluation target. Its Forward-DP procedure, a dynamic programming approach, computes exact unordered slate propensities without factorial enumeration of item orderings, making evaluation practical for autoregressive slate recommendation systems.
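To make the factorial-versus-DP contrast concrete, here is a minimal sketch of the general idea, not the paper's actual algorithm. It assumes a Plackett-Luce-style policy that draws items one at a time without replacement, so the next-item probability depends only on the set already chosen and not on its order. Under that assumption, orderings that reach the same subset can be merged, and probability mass can be pushed forward over subsets of the slate. The function names (`pl_step_prob`, `slate_propensity_dp`, `slate_propensity_bruteforce`) and the weight vector are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def pl_step_prob(weights, chosen_mask, item):
    """Assumed policy: pick `item` next with probability proportional to its
    weight among the items not yet chosen (Plackett-Luce-style sampling)."""
    remaining = sum(w for j, w in enumerate(weights) if not (chosen_mask >> j) & 1)
    return weights[item] / remaining


def slate_propensity_dp(weights, slate):
    """Probability of generating the unordered set `slate`, by forward DP
    over subsets of the slate (2^k states instead of k! orderings)."""
    slate = list(slate)
    k = len(slate)
    # flow[m] = total probability of having chosen exactly the sub-slate
    # encoded by local bitmask m, summed over all orders that reach it.
    flow = [0.0] * (1 << k)
    flow[0] = 1.0
    for m in range(1 << k):
        # Translate the local mask into a mask over the full catalogue.
        chosen = 0
        for b in range(k):
            if (m >> b) & 1:
                chosen |= 1 << slate[b]
        # Push mass forward to every subset that adds one more slate item.
        for b in range(k):
            if not (m >> b) & 1:
                flow[m | (1 << b)] += flow[m] * pl_step_prob(weights, chosen, slate[b])
    return flow[-1]


def slate_propensity_bruteforce(weights, slate):
    """Reference implementation: sum the sequence probability over all k! orderings."""
    total = 0.0
    for order in permutations(slate):
        chosen, p = 0, 1.0
        for item in order:
            p *= pl_step_prob(weights, chosen, item)
            chosen |= 1 << item
        total += p
    return total


weights = [1.0, 2.0, 3.0, 4.0]  # hypothetical item scores
slate = [0, 2]                  # unordered slate of catalogue indices
dp = slate_propensity_dp(weights, slate)
brute = slate_propensity_bruteforce(weights, slate)
```

In this sketch both functions return the same value for the example slate, but the DP visits each subset of the slate once rather than each ordering. An importance weight for a logged slate would then be the ratio of this propensity under the target policy to the one under the logging policy, again assuming both policies satisfy the order-independence condition stated above.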