Same Brain, Different Prediction: How Preprocessing Choices Undermine EEG Decoding Reliability
Researchers demonstrate that EEG-based deep learning models produce unstable predictions when preprocessing pipelines change, with up to 42% of trial-level predictions flipping across different preprocessing choices. The study introduces three tools for measuring and mitigating this instability: a Walsh-Hadamard decomposition of the preprocessing space, a Preprocessing Uncertainty (PU) metric, and a Normalized Adaptive PGI regularizer. Together, these reveal a critical reliability gap in brain-computer interface systems.
This research addresses a fundamental reproducibility problem in EEG-based machine learning that has largely gone unexamined in published work. Most deep learning studies train and validate models on a single, often unreported preprocessing pipeline, creating an illusion of reliability that collapses when different data cleaning or filtering choices are applied. The finding that up to 42% of trial-level predictions flip demonstrates that preprocessing choice is a major source of model instability, one that existing uncertainty quantification methods fail to capture because they condition on a single, fixed pipeline.
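The flip statistic itself is simple to reproduce in principle: apply the same trained model to the same trials under two preprocessing variants and count how often the predicted label changes. The sketch below is a minimal illustration of that measurement; the function and variable names are ours, not the paper's.

```python
import numpy as np

def flip_rate(preds_pipeline_a, preds_pipeline_b):
    """Fraction of trials whose predicted class label differs between two pipelines."""
    a = np.asarray(preds_pipeline_a)
    b = np.asarray(preds_pipeline_b)
    return float(np.mean(a != b))

# Example: the same trained model scores the same 1,000 trials twice,
# once per preprocessing variant; a return value of 0.42 would mean
# 42% of trial-level predictions flipped between the two pipelines.
preds_a = np.random.randint(0, 2, size=1000)   # stand-in for pipeline-A predictions
preds_b = np.random.randint(0, 2, size=1000)   # stand-in for pipeline-B predictions
print(f"flip rate: {flip_rate(preds_a, preds_b):.2%}")
```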
The work's significance extends beyond academic rigor. Brain-computer interfaces and clinical neuroscience applications depend on prediction reliability for diagnostic accuracy and patient safety. When predictions are contingent on undocumented preprocessing choices, real-world deployment becomes hazardous—different labs, clinicians, or device manufacturers applying slightly different filters could produce contradictory results for the same patient data.
The three proposed solutions directly address this gap. The Walsh-Hadamard decomposition reveals that the space of 2^7 = 128 pipelines formed by seven binary preprocessing interventions exhibits near-additive sensitivity, enabling systematic optimization rather than brute-force search. Preprocessing Uncertainty (PU) provides a new diagnostic metric, independent of model confidence, that identifies trials whose predictions are inherently fragile. Normalized Adaptive PGI offers a practical regularization strategy that exploits preprocessing's compositional structure.
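To make the decomposition idea concrete: any per-pipeline sensitivity score over the 128 configurations can be expanded into a Walsh-Hadamard spectrum whose order-0 and order-1 coefficients correspond to main effects. If most of the spectrum's energy sits at those low orders, sensitivity is near-additive and good pipelines can be found without exhaustively evaluating every combination. The sketch below uses placeholder data and standard NumPy/SciPy routines, not the authors' code.

```python
import numpy as np
from scipy.linalg import hadamard

# Stand-in sensitivity scores f(x), one per preprocessing configuration x,
# where bit k of x toggles the k-th binary intervention (filtering, ICA, ...).
# In practice f(x) would be a measured quantity such as the per-pipeline flip rate.
rng = np.random.default_rng(0)
f = rng.normal(size=128)                    # 2**7 = 128 configurations

H = hadamard(128)                           # Sylvester-ordered Walsh-Hadamard matrix
coeffs = H @ f / 128                        # Walsh-Hadamard spectrum of f

order = np.array([bin(i).count("1") for i in range(128)])  # interaction order per coefficient
energy = coeffs ** 2
additive_share = energy[order <= 1].sum() / energy.sum()
print(f"variance explained by order-0 and order-1 terms: {additive_share:.1%}")
```

A genuinely near-additive sensitivity function would show `additive_share` close to 1; the random stand-in here will not, since it exists only to demonstrate the computation.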
For developers and researchers, this work establishes preprocessing stability as a required validation benchmark. Clinical adoption of EEG-based systems may increasingly demand transparency about preprocessing dependencies and demonstrated robustness across reasonable pipeline variations. The research trajectory suggests future standards may mandate preprocessing uncertainty reporting alongside traditional confidence metrics.
- EEG predictions flip in up to 42% of cases when preprocessing pipelines change, indicating severe instability masked by standard evaluation practices.
- Existing uncertainty quantification methods fail to capture preprocessing-induced variability because they condition on a fixed pipeline.
- Walsh-Hadamard decomposition reveals near-additive sensitivity across the 2^7 preprocessing intervention space, enabling efficient optimization.
- Preprocessing Uncertainty (PU) provides a complementary diagnostic metric that identifies fragile predictions independent of model confidence (see the sketch after this list).
- Clinical and real-world EEG applications require documented preprocessing robustness, suggesting future standards will mandate transparency and stability validation.
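For readers who want a feel for what a Preprocessing Uncertainty score could look like in code, the sketch below scores each trial by how much its predicted class distribution varies across pipelines. The paper's exact PU formula is not reproduced here, so treat this as one plausible operationalization rather than the authors' definition.

```python
import numpy as np

def preprocessing_uncertainty(probs_per_pipeline):
    """Illustrative per-trial PU score (not necessarily the paper's exact formula).

    probs_per_pipeline: array of shape (n_pipelines, n_trials, n_classes) holding
    the model's predicted class probabilities for each trial under each pipeline.
    Returns one score per trial: the gap between the entropy of the pipeline-averaged
    prediction and the average per-pipeline entropy, i.e. how much the prediction
    disagrees with itself across pipelines, independent of single-pipeline confidence.
    """
    p = np.asarray(probs_per_pipeline, dtype=float)
    eps = 1e-12
    mean_p = p.mean(axis=0)                                          # (n_trials, n_classes)
    entropy_of_mean = -np.sum(mean_p * np.log(mean_p + eps), axis=-1)
    mean_entropy = -np.sum(p * np.log(p + eps), axis=-1).mean(axis=0)
    return entropy_of_mean - mean_entropy                            # >= 0; 0 means all pipelines agree

# Trials with high PU are the fragile ones: their labels are the most likely to flip
# if a different (but equally reasonable) preprocessing pipeline is chosen.
```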