y0news
#trust-mechanisms · 1 article
AI · Neutral · arXiv – CS AI · 8h ago · 4/10

Learning When to Trust in Contextual Bandits

Researchers propose CESA-LinUCB, an approach to robust reinforcement learning that addresses 'Contextual Sycophancy', in which evaluators are truthful in normal situations but biased in critical contexts. The method learns a trust boundary for each evaluator and achieves sublinear regret even when no evaluator is globally reliable.
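
For readers unfamiliar with the LinUCB family that the method's name refers to, here is a minimal sketch of standard disjoint LinUCB in Python. It is not CESA-LinUCB: the summary gives no details of the trust-boundary mechanism, so none is shown. The class name `LinUCB`, the exploration weight `alpha`, and the toy contexts are illustrative assumptions.

```python
import numpy as np


class LinUCB:
    """Standard disjoint LinUCB baseline (NOT the paper's CESA-LinUCB).

    Each arm keeps its own ridge-regression statistics:
    A = I + sum(x x^T) and b = sum(reward * x).
    """

    def __init__(self, n_arms: int, dim: int, alpha: float = 1.0):
        self.alpha = alpha  # exploration weight (assumed value)
        self.A = [np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def select(self, x: np.ndarray) -> int:
        """Pick the arm with the highest upper confidence bound for context x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b  # ridge estimate of the arm's reward weights
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)  # uncertainty term
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))  # ties go to the lowest arm index

    def update(self, arm: int, x: np.ndarray, reward: float) -> None:
        """Fold one observed (context, reward) pair into the chosen arm."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


# Toy usage: two arms, two-dimensional contexts.
bandit = LinUCB(n_arms=2, dim=2)
context = np.array([1.0, 0.0])
for _ in range(3):
    bandit.update(arm=1, x=context, reward=1.0)
chosen = bandit.select(context)
```

The confidence bonus shrinks as an arm accumulates observations in a given direction of context space, which is what drives exploration toward under-sampled arms.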