
#llm-robustness News & Analysis

4 articles tagged with #llm-robustness. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 4d ago · 7/10

One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness

Researchers demonstrate that instruction-tuned large language models suffer severe performance degradation when subjected to simple lexical constraints, such as banning a single punctuation mark or a common word, losing 14-48% of response quality. The fragility stems from a planning failure in which models couple task competence to narrow surface-form templates, and it affects both open-weight models and commercially deployed closed-weight models such as GPT-4o-mini.

🧠 GPT-4
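
The failure mode is cheap to reproduce: ban one common token at decode time and score the constrained answer against an unconstrained baseline. Below is a minimal sketch in that spirit, assuming the OpenAI chat completions `logit_bias` parameter and tiktoken for token IDs; the model choice, prompt, banned word, and judging step are illustrative, not the paper's protocol.

```python
# Hypothetical probe of lexical-constraint fragility (not the authors' harness).
import tiktoken
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # one of the closed-weight models named in the summary

def banned_token_bias(word: str) -> dict:
    """Hard-ban every single-token encoding of `word`, with and without a leading space."""
    enc = tiktoken.encoding_for_model(MODEL)
    ids = {enc.encode(v)[0] for v in (word, " " + word) if len(enc.encode(v)) == 1}
    return {str(i): -100 for i in ids}  # -100 forbids a token in chat completions

def answer(prompt: str, bias: dict | None = None) -> str:
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        logit_bias=bias or {},
    )
    return resp.choices[0].message.content

prompt = "Explain how TCP congestion control works."
baseline = answer(prompt)                               # unconstrained
constrained = answer(prompt, banned_token_bias("the"))  # same task, one word banned
# Score both responses with your preferred quality judge; the paper reports
# 14-48% quality loss under constraints of roughly this form.
```
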
AI · Bullish · arXiv – CS AI · 6d ago · 7/10

Distributionally Robust Token Optimization in RLHF

Researchers propose Distributionally Robust Token Optimization (DRTO), a method that combines reinforcement learning from human feedback with robust optimization to improve large language model consistency under distribution shift. The approach yields a 9.17% improvement on GSM8K and 2.49% on MathQA, addressing LLM vulnerability to minor input variations.
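
The summary leaves the loss unspecified; one common way to realize a distributionally robust token-level objective is a group-DRO-style update in which an adversary upweights the worst-performing group of tokens. A sketch under that assumption (the grouping scheme, the exponentiated-gradient step, and all names are illustrative, not DRTO's published formulation):

```python
# Group-DRO-style robust token objective (illustrative, not the paper's loss).
import torch

def robust_token_loss(token_nll: torch.Tensor,  # (N,) per-token negative log-likelihood
                      group_ids: torch.Tensor,  # (N,) long tensor, group index in [0, G)
                      group_w: torch.Tensor,    # (G,) adversary's current group weights
                      eta: float = 0.1):
    G = group_w.numel()
    # Mean NLL within each group
    totals = torch.zeros(G).index_add(0, group_ids, token_nll)
    counts = torch.zeros(G).index_add(0, group_ids, torch.ones_like(token_nll))
    group_loss = totals / counts.clamp(min=1.0)
    # Exponentiated-gradient ascent: the adversary shifts weight onto the worst groups
    with torch.no_grad():
        new_w = group_w * torch.exp(eta * group_loss)
        new_w = new_w / new_w.sum()
    # Robust objective: weighted sum of group losses (gradients flow through group_loss)
    return (new_w * group_loss).sum(), new_w

# Each training step backpropagates the returned scalar and carries new_w forward.
```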

AI · Bearish · arXiv – CS AI · Apr 10 · 6/10

MedDialBench: Benchmarking LLM Diagnostic Robustness under Parametric Adversarial Patient Behaviors

Researchers introduce MedDialBench, a benchmark that tests whether large language models maintain diagnostic accuracy when simulated patients exhibit adversarial behaviors along five dimensions. The study finds that fabricating symptoms causes 1.7-3.4x larger accuracy drops than withholding information, with worst-case degradation of 38.8 to 54.1 percentage points across the tested models.
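
To make the two failure axes concrete, here is a toy harness in the spirit of the benchmark (not the released MedDialBench code): perturb a patient case by withholding a genuine symptom or fabricating a misleading one, then measure the diagnostic accuracy drop. `ask_model`, `PatientCase`, and the distractor pool are placeholders.

```python
# Toy adversarial-patient stress test (illustrative, not the benchmark itself).
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class PatientCase:
    symptoms: tuple
    true_dx: str

DISTRACTORS = ("chest pain", "blurred vision", "joint swelling")  # fabricated-symptom pool

def perturb(case, mode, rng):
    if mode == "withhold":   # patient drops one genuine symptom
        kept = list(case.symptoms)
        kept.remove(rng.choice(kept))
        return replace(case, symptoms=tuple(kept))
    if mode == "fabricate":  # patient reports a symptom they don't have
        return replace(case, symptoms=case.symptoms + (rng.choice(DISTRACTORS),))
    return case              # "honest" baseline

def accuracy(cases, mode, ask_model, seed=0):
    rng = random.Random(seed)
    hits = 0
    for case in cases:
        c = perturb(case, mode, rng)
        prompt = "Patient reports: " + ", ".join(c.symptoms) + ". Most likely diagnosis?"
        hits += case.true_dx.lower() in ask_model(prompt).lower()
    return hits / len(cases)

# drop_withhold  = accuracy(cases, "honest", ask) - accuracy(cases, "withhold", ask)
# drop_fabricate = accuracy(cases, "honest", ask) - accuracy(cases, "fabricate", ask)
# The paper reports fabrication hurting 1.7-3.4x more than withholding.
```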