APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs
Researchers propose APPA, a new framework for aligning large language models with diverse human preferences in federated learning environments. The method dynamically reweights group-level rewards to improve fairness, achieving up to 28% better alignment for underperforming groups while maintaining overall model performance.
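The summary does not specify APPA's actual reweighting rule, but the idea of adaptively upweighting underperforming groups can be illustrated with a minimal sketch. The softmax-over-negative-rewards scheme, the `temperature` parameter, and both function names below are assumptions for illustration (in the spirit of distributionally robust weighting), not the published algorithm.

```python
import math

# Hypothetical sketch: adaptively reweight group-level rewards so that
# groups with lower average reward receive larger weights. This is an
# illustrative stand-in for APPA's unspecified reweighting rule.

def reweight_groups(group_rewards, temperature=1.0):
    """Return normalized weights emphasizing underperforming groups.

    group_rewards: dict mapping group id -> mean reward this round.
    Weights follow a softmax over *negative* rewards, so lower-reward
    groups get higher weight; a smaller temperature makes the
    reweighting more aggressive.
    """
    exps = {g: math.exp(-r / temperature) for g, r in group_rewards.items()}
    total = sum(exps.values())
    return {g: e / total for g, e in exps.items()}

def aggregate_reward(group_rewards, weights):
    """Weighted aggregate reward used as the shared training signal."""
    return sum(weights[g] * r for g, r in group_rewards.items())

# Example: group_b is underperforming and so receives the largest weight.
rewards = {"group_a": 0.9, "group_b": 0.4, "group_c": 0.6}
w = reweight_groups(rewards, temperature=0.5)
```

Recomputing the weights each round lets the aggregate objective track whichever groups are currently worst off, which is one plausible route to the fairness gains the summary describes.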