AI · Neutral · arXiv – CS AI · 14h ago · 6/10
Principles Do Not Apply Themselves: A Hermeneutic Perspective on AI Alignment
A new arXiv paper argues that AI alignment cannot rely solely on stated principles because their real-world application requires contextual judgment and interpretation. The research shows that a significant portion of preference-labeling data involves principle conflicts or indifference, meaning principles alone cannot determine decisions. These interpretive choices often emerge only during model deployment rather than in training data.