y0news

#risk-averse-optimization News & Analysis

1 article tagged with #risk-averse-optimization. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 article
AI · Bullish · arXiv – CS AI · 9h ago · 7/10
OrderGrad: Optimizing Beyond the Mean with Order-Statistic Policy Gradient Estimation

OrderGrad introduces a family of gradient estimators that optimize order-statistic objectives rather than expected returns, enabling policy-gradient methods to directly target risk-sensitive metrics like Value-at-Risk, Conditional Value-at-Risk, and best-of-K outcomes. The method works as a plug-and-play reward transformation compatible with standard reinforcement learning algorithms, with applications demonstrated in LLM post-training and other domains.
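
To make the risk metrics named in the summary concrete, here is a minimal Python sketch of how Value-at-Risk, Conditional Value-at-Risk, and best-of-K can be computed as order statistics of a batch of sampled returns. It is not the OrderGrad estimator, whose details the summary does not give. All function names are hypothetical, and the lower-tail convention with k = ceil(alpha * n) is an assumption. The last function illustrates the general idea of a reward transformation using the classic sampled CVaR policy-gradient weighting, a swapped-in, well-known technique rather than the paper's method.

```python
import math
from typing import List, Sequence


def _tail_size(n: int, alpha: float) -> int:
    """Number of samples in the lower alpha-tail (assumed: k = ceil(alpha * n))."""
    if n == 0:
        raise ValueError("returns must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return max(1, math.ceil(alpha * n))


def value_at_risk(returns: Sequence[float], alpha: float) -> float:
    """Empirical lower-tail VaR: the k-th smallest return."""
    ordered = sorted(returns)
    return ordered[_tail_size(len(ordered), alpha) - 1]


def conditional_value_at_risk(returns: Sequence[float], alpha: float) -> float:
    """Empirical CVaR: mean of the k smallest returns."""
    ordered = sorted(returns)
    k = _tail_size(len(ordered), alpha)
    return sum(ordered[:k]) / k


def best_of_k(returns: Sequence[float]) -> float:
    """Best-of-K outcome: the largest order statistic of K samples."""
    if not returns:
        raise ValueError("returns must be non-empty")
    return max(returns)


def cvar_transformed_rewards(returns: Sequence[float], alpha: float) -> List[float]:
    """Illustrative reward transformation (NOT OrderGrad's estimator).

    Replaces each return r with (r - VaR) / alpha if r is in the lower
    tail, else 0. Feeding these values to a standard score-function
    policy-gradient update in place of raw returns targets CVaR instead
    of the mean.
    """
    var = value_at_risk(returns, alpha)
    return [(r - var) / alpha if r <= var else 0.0 for r in returns]


if __name__ == "__main__":
    sampled_returns = [1.0, 4.0, 2.0, 3.0]  # hypothetical episode returns
    print(value_at_risk(sampled_returns, 0.5))
    print(conditional_value_at_risk(sampled_returns, 0.5))
    print(best_of_k(sampled_returns))
    print(cvar_transformed_rewards(sampled_returns, 0.5))
```

Because the transformation only rewrites the per-sample reward signal, the rest of a policy-gradient training loop can stay unchanged, which is the sense in which such a transformation is "plug-and-play".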