y0news
← Feed
←Back to feed
🧠 AI | 🟢 Bullish | Importance 6/10

APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs

arXiv – CS AI | Mahmoud Srewa, Tianyu Zhao, Salma Elmalaki
🤖 AI Summary

Researchers propose APPA, a new framework for aligning large language models with diverse human preferences in federated learning environments. The method dynamically reweights group-level rewards to improve fairness, achieving up to 28% better alignment for underperforming groups while maintaining overall model performance.

Key Takeaways
  • β†’APPA addresses fairness issues in federated reinforcement learning from human feedback by dynamically reweighting group rewards based on historical performance.
  • β†’The framework improves worst-performing group alignment by up to 28% compared to average aggregation methods.
  • β†’Testing across three model families (Gemma 2 2B, Llama 3.2 3B, Qwen3 0.6B) demonstrates consistent improvements in fairness-alignment trade-offs.
  • β†’The approach operates without requiring access to raw preference data, making it suitable for privacy-preserving federated learning scenarios.
  • β†’APPA outperforms both average-based and min-based aggregation methods in balancing overall alignment with fairness across diverse user groups.