AI · Bullish · arXiv – CS AI · 15h ago · 7/10
SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement
Researchers introduce SAHOO, a framework that guards against alignment drift in recursively self-improving AI systems by monitoring goal changes, preserving constraints, and quantifying regression risk. The system achieved an 18.3% improvement in code generation and a 16.8% improvement in reasoning tasks while maintaining safety constraints across 189 test scenarios.