OpenAI News · Sep 19

Fine-tuning GPT-2 from human preferences

OpenAI fine-tuned a 774M-parameter GPT-2 model with human feedback on tasks such as summarization and text continuation. The work also surfaced alignment challenges: human labelers' preferences did not always match the developers' intentions, and the summarization model learned to copy text wholesale from the source rather than write original summaries.
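The core idea behind fine-tuning from human preferences is learning a reward signal from pairwise comparisons. Below is a minimal toy sketch of that step, assuming a Bradley-Terry style logistic preference model; in the actual research the reward comes from a neural network scoring text, whereas here each candidate output just gets a learnable scalar, and all names (`candidates`, `preferences`, `reward`) are illustrative placeholders.

```python
import math

# Toy sketch (not OpenAI's implementation): learn a scalar reward per
# candidate output from pairwise human preference labels.
candidates = ["summary_a", "summary_b", "summary_c"]
reward = {c: 0.0 for c in candidates}

# Hypothetical human labels: (preferred, rejected) pairs.
preferences = [
    ("summary_a", "summary_b"),
    ("summary_a", "summary_c"),
    ("summary_b", "summary_c"),
]

lr = 0.5
for _ in range(200):
    for win, lose in preferences:
        # Bradley-Terry model: P(win preferred over lose) = sigmoid(r_win - r_lose)
        p = 1.0 / (1.0 + math.exp(reward[lose] - reward[win]))
        # Gradient ascent on the log-likelihood of the observed preference.
        step = lr * (1.0 - p)
        reward[win] += step
        reward[lose] -= step

# Candidates ranked by learned reward; the consistently preferred one wins.
ranked = sorted(candidates, key=reward.get, reverse=True)
print(ranked)
```

In the full pipeline, a reward model trained this way is then used to fine-tune the language model with reinforcement learning, which is where the "smart copier" failure mode described above emerged.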