#preprocessing News & Analysis

4 articles tagged with #preprocessing. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Same Brain, Different Prediction: How Preprocessing Choices Undermine EEG Decoding Reliability

Researchers demonstrate that EEG-based deep learning models produce unstable predictions when the preprocessing pipeline changes, with up to 42% of predictions flipping across preprocessing choices. The study introduces three tools (Walsh-Hadamard decomposition, Preprocessing Uncertainty metrics, and a regularization approach) to measure and mitigate this instability, exposing a critical reliability gap in brain-computer interface systems.
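
To make the "flipping" figure concrete, here is a minimal sketch of one plausible way to compute a flip rate: the fraction of samples whose predicted label is not the same under every preprocessing pipeline. The summary does not give the paper's exact definition, so the function name `flip_rate`, the pipeline names, and this particular definition are assumptions for illustration only.

```python
def flip_rate(predictions_by_pipeline):
    """Fraction of samples whose predicted label differs across pipelines.

    `predictions_by_pipeline` maps a (hypothetical) pipeline name to the list
    of labels a model predicted for the same samples under that pipeline.
    This definition is an assumption, not necessarily the paper's metric.
    """
    pipelines = list(predictions_by_pipeline.values())
    n_samples = len(pipelines[0])
    if any(len(p) != n_samples for p in pipelines):
        raise ValueError("all pipelines must predict the same samples")
    # A sample "flips" if at least two pipelines disagree on its label.
    flipped = sum(
        1 for i in range(n_samples) if len({p[i] for p in pipelines}) > 1
    )
    return flipped / n_samples


# Illustrative labels only; these are not results from the study.
example = {
    "bandpass_1_40hz": [0, 1, 1, 0],
    "bandpass_4_30hz": [0, 1, 0, 0],
    "no_rereference": [0, 1, 1, 1],
}
rate = flip_rate(example)
```

In this toy input the last two samples receive different labels depending on the pipeline, so half of the predictions count as flipped.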

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

A Triadic Suffix Tokenization Scheme for Numerical Reasoning

Researchers propose Triadic Suffix Tokenization (TST), a tokenization scheme that changes how large language models process numbers by splitting them into three-digit groups with explicit magnitude markers. The method aims to improve arithmetic and scientific reasoning in LLMs by preserving decimal structure and positional information, with two implementation variants offering scalability across 33 orders of magnitude.
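
As a rough illustration of the idea of three-digit groups with magnitude markers, the sketch below splits a decimal string into triads and appends a marker giving each group's power of ten. The summary does not describe the actual token format, so the `<eN>` marker syntax, the zero-padding of the last fractional group, and the function name `triadic_suffix_tokens` are all assumptions rather than the paper's scheme.

```python
def triadic_suffix_tokens(number: str) -> list[str]:
    """Split a non-negative decimal string into three-digit groups.

    Each group gets a hypothetical magnitude marker "<eN>", meaning the
    group's value is scaled by 10**N. The marker format is an assumption.
    """
    int_part, _, frac_part = number.partition(".")
    int_part = int_part.lstrip("0") or "0"
    tokens = []

    # Integer part: group from the right so groups align with powers of 1000.
    first = len(int_part) % 3 or 3
    groups = [int_part[:first]]
    groups += [int_part[i:i + 3] for i in range(first, len(int_part), 3)]
    for idx, group in enumerate(groups):
        exponent = 3 * (len(groups) - 1 - idx)
        tokens.append(f"{group}<e{exponent}>")

    # Fractional part: group from the left, padding the last group with zeros
    # so every group is a whole number of thousandths, millionths, and so on.
    for start in range(0, len(frac_part), 3):
        group = frac_part[start:start + 3].ljust(3, "0")
        tokens.append(f"{group}<e-{start + 3}>")
    return tokens


tokens = triadic_suffix_tokens("1234567")
```

Under these assumptions, `"1234567"` becomes `["1<e6>", "234<e3>", "567<e0>"]`, so the position of every digit group is stated explicitly instead of being inferred from its place in the sequence.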

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 6

Scattering Transform for Auditory Attention Decoding

Researchers propose using scattering transform as a preprocessing method for EEG-based auditory attention decoding to solve the cocktail party problem in hearing aids. The two-layer scattering transform showed significant performance improvements on subject-related classification tasks, particularly on the KU Leuven dataset when compared to traditional preprocessing methods.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4

USE: Uncertainty Structure Estimation for Robust Semi-Supervised Learning

Researchers introduce Uncertainty Structure Estimation (USE), a new preprocessing method for semi-supervised learning that improves model reliability by filtering out low-quality unlabeled data. The approach uses entropy scores and statistical thresholds to identify and remove out-of-distribution samples before training, demonstrating consistent accuracy improvements across imaging and NLP tasks.
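
The filtering step described above can be sketched as follows: compute the Shannon entropy of each unlabeled sample's predicted class distribution, then drop samples whose entropy exceeds a statistical threshold. The summary does not specify the threshold rule, so the "mean plus z standard deviations" cutoff, the parameter `z`, and the function names here are assumptions, not the USE method itself.

```python
import math
import statistics


def prediction_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def filter_by_entropy(prob_rows, z=1.0):
    """Return indices of samples to keep, plus the threshold used.

    Samples whose entropy exceeds mean + z * std are treated as likely
    out-of-distribution. This cutoff rule is an assumption for illustration.
    """
    entropies = [prediction_entropy(row) for row in prob_rows]
    threshold = statistics.mean(entropies) + z * statistics.pstdev(entropies)
    keep = [i for i, h in enumerate(entropies) if h <= threshold]
    return keep, threshold


# Illustrative softmax outputs for five unlabeled samples (made-up values).
unlabeled_probs = [
    [0.98, 0.01, 0.01],
    [0.97, 0.02, 0.01],
    [0.96, 0.02, 0.02],
    [0.95, 0.03, 0.02],
    [1 / 3, 1 / 3, 1 / 3],  # maximally uncertain prediction
]
kept, cutoff = filter_by_entropy(unlabeled_probs)
```

Here the uniform prediction has by far the highest entropy and is the only sample removed, while the four confident predictions are kept for training.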
