WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms
Researchers introduce WavesFM, a foundation model using hierarchical self-supervised learning to extract health insights from continuous wearable sensor data. Trained on 6.8M hours of physiological recordings from 324k individuals, the model captures both local waveform patterns and long-term behavioral dynamics, demonstrating strong performance across 58 health-related prediction tasks.
WavesFM addresses a critical gap in physiological signal processing by combining two previously siloed approaches: detailed morphological analysis of raw waveforms and temporal modeling of behavioral patterns. The hierarchical architecture—first learning segment-level embeddings, then modeling their temporal sequences—elegantly solves the computational bottleneck that has constrained prior work on multi-day wearable data.
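To make the two-stage design concrete, the sketch below shows one way such a hierarchy can be organized in PyTorch. This is not the authors' code: the 1-D CNN segment encoder, transformer temporal encoder, segment length, and embedding size are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class SegmentEncoder(nn.Module):
    """Stage 1: embed short, high-resolution waveform segments.
    A small 1-D CNN is assumed here; the actual local encoder may differ."""
    def __init__(self, in_channels: int = 1, embed_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=7, stride=2, padding=3),
            nn.GELU(),
            nn.Conv1d(64, 128, kernel_size=5, stride=2, padding=2),
            nn.GELU(),
            nn.Conv1d(128, embed_dim, kernel_size=3, stride=2, padding=1),
            nn.AdaptiveAvgPool1d(1),  # pool over time within each segment
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_segments, channels, samples_per_segment)
        b, s, c, t = x.shape
        z = self.net(x.reshape(b * s, c, t)).squeeze(-1)  # (b*s, embed_dim)
        return z.reshape(b, s, -1)                        # (b, s, embed_dim)

class TemporalEncoder(nn.Module):
    """Stage 2: model the sequence of segment embeddings across days."""
    def __init__(self, embed_dim: int = 256, depth: int = 4, heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, num_segments, embed_dim) -> contextualized embeddings
        return self.encoder(z)

# Example: 1,024 one-minute segments at 25 Hz (1,500 samples each).
segments = torch.randn(2, 1024, 1, 1500)
features = TemporalEncoder()(SegmentEncoder()(segments))  # (2, 1024, 256)
```

The computational win is that attention operates over a few thousand segment tokens rather than millions of raw samples, which is what makes multi-day recordings tractable.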
The research emerges from growing recognition that existing methods fail to capture the full picture of human health. Traditional self-supervised approaches either extract rich local features while discarding longitudinal structure, or operate on coarse hand-crafted metrics that strip away subtle predictive signals. WavesFM bridges this gap by pretraining at massive scale: over 6.8M hours of raw sensor data from hundreds of thousands of individuals, enabling the model to learn generalizable representations of physiological dynamics without expensive ground-truth labels.
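The article does not spell out the pretraining objectives, so the following is a hedged sketch of one common choice for the temporal stage: masked prediction over segment embeddings, where the stage-one local encoder is kept frozen and the temporal encoder learns to reconstruct masked slots from context.

```python
import torch
import torch.nn.functional as F

def masked_segment_loss(local, temporal, segments, mask_ratio: float = 0.3):
    """Illustrative stage-2 SSL objective (assumed, not from the paper):
    mask a fraction of segment embeddings and reconstruct them in context."""
    with torch.no_grad():
        targets = local(segments)          # (b, s, d) frozen stage-1 targets
    z = targets.clone()
    mask = torch.rand(z.shape[:2], device=z.device) < mask_ratio
    z[mask] = 0.0                          # zero out masked segment slots
    pred = temporal(z)                     # reconstruct from temporal context
    return F.mse_loss(pred[mask], targets[mask])
```

Stage one would need its own label-free objective on raw segments (contrastive or reconstruction losses are typical choices); none of these specifics should be read as the paper's actual recipe.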
Strong downstream performance across 58 tasks, spanning demographics, lifestyle factors, health conditions, and medications, suggests WavesFM could become a practical tool for clinical research and consumer health applications. The model's ability to infer health states from raw waveforms has implications for early disease detection, medication adherence monitoring, and understanding circadian variations in health metrics.
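One plausible way a single pretrained model serves this many heterogeneous tasks is linear probing: freeze the encoders and fit a lightweight head per task. The article does not state the evaluation protocol, so treat this as an assumed setup rather than the authors' method.

```python
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """Hypothetical per-task head over frozen WavesFM features."""
    def __init__(self, embed_dim: int = 256, num_classes: int = 2):
        super().__init__()
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, num_segments, embed_dim) from the frozen model;
        # mean-pool over the whole recording before classifying.
        return self.head(features.mean(dim=1))
```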
Looking ahead, the key question is whether such models can transition from research prototypes to deployed systems. Integration with wearable manufacturers, clinical validation studies, and regulatory pathways will determine real-world impact. The foundation model approach also raises questions about data privacy and fair representation across demographic groups in the training set.
- Two-stage SSL framework decomposes high-resolution waveform learning into manageable local and temporal components
- Trained on 6.8M hours of physiological data from 324k individuals without requiring manual labels
- Demonstrates strong generalization across 58 diverse health prediction tasks from a single pretrained model
- Hierarchical approach overcomes the computational complexity that previously limited analysis of weeks-long sensor recordings
- Foundation model architecture enables potential deployment in clinical settings and consumer health applications