
APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs

arXiv – CS AI | Mahmoud Srewa, Tianyu Zhao, Salma Elmalaki

AI Summary

Researchers propose APPA, a new framework for aligning large language models with diverse human preferences in federated learning environments. The method dynamically reweights group-level rewards to improve fairness, achieving up to 28% better alignment for underperforming groups while maintaining overall model performance.

Key Takeaways
  • APPA addresses fairness issues in federated reinforcement learning from human feedback by dynamically reweighting group rewards based on historical performance.
  • The framework improves worst-performing group alignment by up to 28% compared to average aggregation methods.
  • Testing across three model families (Gemma 2 2B, Llama 3.2 3B, Qwen3 0.6B) demonstrates consistent improvements in fairness-alignment trade-offs.
  • The approach operates without requiring access to raw preference data, making it suitable for privacy-preserving federated learning scenarios.
  • APPA outperforms both average-based and min-based aggregation methods in balancing overall alignment with fairness across diverse user groups.
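The core idea of dynamically reweighting group-level rewards can be illustrated with a short sketch. This is a hypothetical illustration, not the paper's exact algorithm: the function names, the softmax-over-negative-scores update rule, and the temperature parameter are all assumptions. Groups with weaker historical alignment receive larger aggregation weights, so the aggregated reward emphasizes underperforming groups.

```python
import math

def adaptive_weights(history, temperature=1.0):
    """Map per-group historical alignment scores to aggregation weights.

    Lower historical scores -> higher weights, via a softmax over the
    negated scores (an illustrative choice, not APPA's exact rule).
    """
    exps = [math.exp(-s / temperature) for s in history]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_rewards(group_rewards, history, temperature=1.0):
    """Weighted average of group-level rewards using adaptive weights."""
    w = adaptive_weights(history, temperature)
    return sum(wi * ri for wi, ri in zip(w, group_rewards))

# Example: group 2 has the worst alignment history, so it gets the
# largest weight in the next aggregation round.
hist = [0.9, 0.8, 0.4]              # historical alignment per group
weights = adaptive_weights(hist)
assert weights.index(max(weights)) == 2
```

Note that this operates only on scalar group scores and rewards, consistent with the takeaway that no raw preference data needs to leave the clients.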