SENTRY: Statistical Reliability Analysis of Vision Transformers Under Soft Errors
Researchers present SENTRY, a statistical fault injection framework that efficiently evaluates Vision Transformers' reliability against soft errors in safety-critical applications. The method achieves formal reliability guarantees using finite-population sampling theory, reducing experimental costs by up to 10,700x while identifying critical vulnerabilities in normalization layers and IEEE-754 exponent bits.
SENTRY addresses a critical gap in AI safety research: as Vision Transformers proliferate in autonomous vehicles and medical imaging systems, their vulnerability to hardware-induced soft errors remains largely unquantified. Exhaustive fault injection, which tests every possible fault location, becomes computationally prohibitive at modern model scales, which is what makes a statistical approach attractive for reliability assessment.
The framework uses finite-population sampling theory to provide formal guarantees: it bounds the estimated failure rate within a 1% margin of error at 99% confidence using only thousands of samples, rather than brute-force testing of every possible fault. The research also reveals a sharply non-uniform reliability landscape: only 3% of bit-flips cause failures, but those failures typically result in catastrophic accuracy collapse rather than graceful degradation. This non-uniform vulnerability distribution matters to engineers designing hardened systems, because it indicates where protection is actually needed.
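To make the sampling claim concrete, here is a minimal sketch of how a sample size could be derived for a 1% margin at 99% confidence. It assumes the standard finite-population formula for estimating a proportion; the article does not state the exact expression SENTRY uses. The function name `sample_size`, its parameters, and the example population (a hypothetical model with 86 million 32-bit parameters) are illustrative assumptions, not values from the paper.

```python
import math


def sample_size(population: int, margin: float = 0.01,
                z: float = 2.576, p: float = 0.5) -> int:
    """Finite-population sample size for estimating a proportion.

    population: number of possible faults (assumed: parameters * bits each)
    margin:     desired margin of error (0.01 = 1%)
    z:          normal quantile for the confidence level (2.576 ~ 99%)
    p:          assumed failure proportion; 0.5 is the most conservative choice
    """
    if population < 1:
        raise ValueError("population must be at least 1")
    n = population / (1 + margin**2 * (population - 1) / (z**2 * p * (1 - p)))
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical fault space: 86 million float32 parameters, 32 bits each.
    faults = 86_000_000 * 32
    n = sample_size(faults)
    print(f"possible single-bit faults: {faults}")
    print(f"injections needed:          {n}")
    print(f"reduction vs. exhaustive:   {faults / n:.0f}x")
```

For very large populations the result approaches z²·p(1−p)/e² ≈ 16,589, which is consistent with the "thousands of samples" figure above. The reduction factor printed here depends entirely on the assumed population and is not meant to reproduce the 10,700x figure reported for SENTRY.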
For the AI industry, this work has immediate practical implications. Safety-critical deployments of ViTs require certification and reliability assurance, particularly in regulated domains like automotive and healthcare. SENTRY enables rapid characterization of vulnerability hotspots, specifically normalization layers and critical floating-point exponent bits, so that hardening efforts can be targeted at them. Organizations deploying ViTs can now benchmark reliability systematically rather than relying on best-effort assumptions.
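The following sketch illustrates why exponent bits are plausible hotspots. It flips a single bit in the IEEE-754 single-precision encoding of a value and compares the effect of a mantissa bit with that of an exponent bit. The helper `flip_bit` and the example value 0.5 are assumptions made for illustration; this is not SENTRY's injection code.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the float32 encoding of `value`.

    Bit layout (IEEE-754 single precision):
      bits 0-22  mantissa, bits 23-30 exponent, bit 31 sign.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR with a one-hot mask flips exactly the requested bit.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


if __name__ == "__main__":
    weight = 0.5  # hypothetical parameter value
    for bit in (0, 22, 23, 30, 31):
        print(f"bit {bit:2d}: {weight} -> {flip_bit(weight, bit)!r}")
```

Flipping the lowest mantissa bit changes 0.5 by about 6e-8, while flipping the top exponent bit (bit 30) turns it into 2^127, roughly 1.7e38. A single corrupted value of that magnitude can dominate every sum it enters, which matches the pattern of rare but catastrophic failures described above.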
The methodology's cost reduction enables broader adoption of reliability testing across model architectures and hardware platforms. As edge deployment of vision transformers accelerates, understanding fault propagation mechanisms becomes essential for system designers. The actionable insights regarding specific vulnerable components provide a foundation for developing mitigation strategies, whether through selective replication, precision management, or architectural modifications tailored to safety requirements.
- SENTRY reduces ViT reliability testing costs by up to 10,700x by using statistical sampling instead of exhaustive fault injection
- Only 3% of floating-point bit-flips cause failures, but these events typically trigger complete accuracy collapse
- Normalization layers and IEEE-754 exponent bits are identified as critical vulnerability hotspots in Vision Transformers
- The statistical framework provides formal reliability guarantees with a 1% margin of error at 99% confidence using a limited number of samples
- The findings enable targeted hardening strategies for safety-critical ViT deployments in autonomous systems and medical imaging