AI · Neutral · arXiv – CS AI · 5h ago · 6/10
VALUEFLOW: Toward Pluralistic and Steerable Value-based Alignment in Large Language Models
Researchers introduce VALUEFLOW, a comprehensive framework for aligning Large Language Models with diverse human values through hierarchical extraction, calibrated intensity evaluation, and steerable control mechanisms. The system addresses fundamental limitations in existing preference-based alignment approaches by enabling precise, multi-theory value alignment at controlled intensities across different models.