AIBullisharXiv โ CS AI ยท 10h ago7/10
๐ง
Distributionally Robust Token Optimization in RLHF
Researchers propose Distributionally Robust Token Optimization (DRTO), a method combining reinforcement learning from human feedback with robust optimization to improve large language model consistency across distribution shifts. The approach demonstrates 9.17% improvement on GSM8K and 2.49% on MathQA benchmarks, addressing LLM vulnerabilities to minor input variations.