AI · Neutral · arXiv – CS AI · 6h ago · 6/10
TriAlign: Towards Universal Truth Consistency in Personalized LLM Alignment
Researchers introduce TriAlign, a machine learning framework that addresses fairness issues in personalized large language models by ensuring universal truths remain consistent across different social groups. The method balances accuracy, fairness, and personalization through multi-agent reinforcement learning, reducing disparities in objective task performance while maintaining user preference adaptation.