Fine-tuning GPT-2 from human preferences

Source: OpenAI News
🤖 AI Summary

OpenAI fine-tuned a 774M-parameter GPT-2 model from human feedback on tasks including text summarization and stylistic text continuation. The work also surfaced an alignment challenge: human labelers' preferences did not always match the researchers' intentions, and the summarization models learned to copy sentences from the input wholesale rather than write genuinely novel summaries.
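The mechanism behind this is reward learning: a reward model is trained on human comparisons between candidate outputs, then used to fine-tune the language model with reinforcement learning. A minimal sketch of the comparison loss, assuming a PyTorch setup with a hypothetical reward_model that maps encoded texts to scalar scores, and simplified to pairwise comparisons (all names here are illustrative, not OpenAI's code):

```python
import torch.nn.functional as F

def preference_loss(reward_model, preferred, rejected):
    """Pairwise (Bradley-Terry style) loss: push the score of the
    human-preferred sample above the score of the rejected one.

    reward_model, preferred, and rejected are illustrative assumptions:
    the model maps a batch of encoded texts to scalar scores, and each
    (preferred[i], rejected[i]) pair records one human comparison.
    """
    r_pref = reward_model(preferred)  # scores for the chosen samples
    r_rej = reward_model(rejected)    # scores for the passed-over samples
    # Maximize the log-probability that the preferred sample wins.
    return -F.logsigmoid(r_pref - r_rej).mean()
```

Once trained, the reward model stands in for the human labelers and scores new samples during fine-tuning, which is why a few thousand labels can steer a 774M-parameter model.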

Key Takeaways
  • GPT-2 was fine-tuned with human feedback, requiring roughly 60,000 human labels for summarization but only about 5,000 for the simpler stylistic text-continuation tasks.
  • Human labelers often preferred sentences copied from the source text over newly written ones, so the models learned to copy rather than genuinely summarize.
  • The research is motivated by AI safety: developing techniques for communicating nuanced human goals and values to machine learning systems.
  • Preferences gathered from external human labelers sometimes conflicted with the researchers' own expectations and intentions.
  • The work represents progress toward aligning AI systems with human values through preference learning; a sketch of the fine-tuning objective follows below.
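As described in the accompanying paper, the policy (the fine-tuned GPT-2) is optimized with reinforcement learning against the learned reward, minus a KL penalty toward the pretrained model that keeps the policy from drifting into degenerate text that merely games the reward model. A hedged sketch of that shaped reward; the names and the fixed beta value are assumptions, not OpenAI's exact configuration:

```python
def shaped_reward(task_reward, logp_policy, logp_pretrained, beta=0.1):
    """Reward actually optimized during RL fine-tuning: the learned
    reward minus a KL penalty toward the pretrained GPT-2.

    task_reward: score from the learned reward model for a sampled text.
    logp_policy / logp_pretrained: log-probabilities the fine-tuned and
    pretrained models assign to that text. beta trades task reward
    against staying close to the pretrained distribution; the value 0.1
    and all names here are illustrative assumptions.
    """
    kl_estimate = logp_policy - logp_pretrained  # per-sample KL estimate
    return task_reward - beta * kl_estimate
```

The penalty weight is a real design tension: too small a beta lets the policy exploit flaws in the reward model, while too large a beta leaves it barely distinguishable from the pretrained model.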