AI · Neutral · arXiv – CS AI · 10h ago · 6/10
EvoPref: Multi-Objective Evolutionary Optimization Discovers Diverse LLM Alignments Beyond Gradient Descent
Researchers introduce EvoPref, a multi-objective evolutionary algorithm that optimizes LLM alignment across several objectives at once using population-based search rather than gradient descent. The authors report an 18% improvement in preference coverage and a 47% reduction in preference collapse, with alignment quality that remains competitive with gradient-based methods such as ORPO.
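To illustrate what population-based multi-objective search means in general, here is a minimal sketch of Pareto-based selection on a toy problem. It is not EvoPref's actual procedure, which the summary does not describe. The names `dominates`, `pareto_front`, and `evolve`, the two toy objectives, and the Gaussian mutation are all assumptions made for illustration.

```python
import random

def dominates(a, b):
    """True if score vector a is no worse than b on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(population, evaluate):
    """Return the individuals whose scores no other individual dominates."""
    scored = [(ind, evaluate(ind)) for ind in population]
    return [ind for ind, s in scored
            if not any(dominates(t, s) for _, t in scored)]

def evolve(population, evaluate, mutate, generations=20, size=16, seed=0):
    """Hypothetical loop: mutate random parents, keep the non-dominated set."""
    rng = random.Random(seed)
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(size)]
        # Survivors are the non-dominated individuals, truncated to `size`.
        population = pareto_front(population + children, evaluate)[:size]
    return population

# Toy stand-in for two conflicting objectives: be close to 0 AND close to 1.
def toy_objectives(w):
    return (-(w - 0.0) ** 2, -(w - 1.0) ** 2)

def toy_mutate(w, rng):
    return w + rng.gauss(0.0, 0.1)

survivors = evolve([3.0, -2.0], toy_objectives, toy_mutate)
print(sorted(round(w, 2) for w in survivors))
```

Because survivors are selected for being non-dominated rather than for maximizing one scalar loss, the population can retain several distinct trade-offs between objectives. This is the general property that multi-objective evolutionary methods rely on to keep solutions diverse.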