y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#dpo-alignment News & Analysis

1 article tagged with #dpo-alignment. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 articles
AIBullisharXiv – CS AI · 15h ago7/10
🧠

Curriculum Learning for Safety Alignment

Researchers propose Staged-Competence, a curriculum learning framework that enhances Direct Preference Optimisation (DPO) for AI safety alignment. The method reduces out-of-distribution harmful responses by 16% and jailbreak success rates by 20% while maintaining model capabilities, achieving baseline safety with 25% less training data.