y0news

#distributionally-robust-optimization News & Analysis

1 article tagged with #distributionally-robust-optimization. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Apr 13 · 7/10

Distributionally Robust Token Optimization in RLHF

Researchers propose Distributionally Robust Token Optimization (DRTO), a method that combines reinforcement learning from human feedback (RLHF) with distributionally robust optimization so that large language models behave more consistently under distribution shift. The approach shows improvements of 9.17% on GSM8K and 2.49% on MathQA, and targets the sensitivity of LLMs to minor input variations.