Safety Under Scaffolding: How Evaluation Conditions Shape Measured Safety
A large-scale study of 62,808 AI safety evaluations across six frontier models reveals that deployment scaffolding architectures can significantly impact measured safety, with map-reduce scaffolding degrading safety performance. The research found that evaluation format (multiple-choice vs open-ended) affects safety scores more than scaffold architecture itself, and safety rankings vary dramatically across different models and configurations.