AI · Neutral · arXiv CS AI · 6h ago
What Is the Geometry of the Alignment Tax?
Researchers present a formal geometric theory for quantifying the alignment tax, the tradeoff between AI safety and capability performance. They derive mathematical frameworks in which safety-capability conflicts are measured by the angles between representation subspaces, and provide scaling laws for how these tradeoffs evolve with model size.
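
As a rough illustration of what "angles between representation subspaces" can mean, the sketch below computes the standard principal angles between two subspaces with NumPy. It is not taken from the paper: the function name `principal_angles` and the toy "safety" and "capability" directions are hypothetical, and the paper's own definition of the angle may differ.

```python
import numpy as np


def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B.

    Assumes A and B have full column rank and the same number of rows.
    """
    # Orthonormal bases for each subspace.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off pushing values slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))


# Hypothetical toy example: two one-dimensional subspaces of R^3
# separated by 60 degrees.
safety_dir = np.array([[1.0], [0.0], [0.0]])
capability_dir = np.array([[np.cos(np.pi / 3)], [np.sin(np.pi / 3)], [0.0]])

angles = principal_angles(safety_dir, capability_dir)
print(np.degrees(angles))
```

An angle near 0 means the two subspaces share directions, and an angle near 90 degrees means they are orthogonal. How such angles translate into a size for the alignment tax depends on the paper's definitions, which this summary does not state.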