Stanford Sleep Bench: Evaluating Polysomnography Pre-training Methods for Sleep Foundation Models

Summary

This paper introduces Stanford Sleep Bench, a large-scale polysomnography (PSG) dataset and benchmark designed to address two gaps in sleep foundation model development: the lack of a shared dataset with diverse tasks, and the absence of a systematic comparison of pre-training methods. It evaluates self-supervised representation learning (SSRL) pre-training methods across sleep-related and clinical prediction tasks, and finds that contrastive learning significantly outperforms other approaches for mortality and disease prediction.

Medical Relevance

This research provides a critical foundation for developing advanced AI tools to analyze complex PSG data, potentially enabling more accurate and early diagnosis, prognosis, and risk stratification for sleep disorders and associated systemic diseases.

AI Health Application

Foundation models are pre-trained on polysomnography (PSG) data with self-supervised representation learning (SSRL) and then applied to downstream tasks: sleep staging, apnea diagnosis, age estimation, and prediction of clinical diseases and mortality from sleep patterns and physiological signals.

Key Points

  • Introduces Stanford Sleep Bench, a large-scale PSG dataset comprising 17,467 recordings (over 163,000 hours) from a major sleep clinic.
  • The benchmark includes canonical sleep tasks (sleep staging, apnea diagnosis, age estimation) and 13 clinical disease prediction tasks.
  • Systematically evaluates various Self-Supervised Representation Learning (SSRL) pre-training methods.
  • Assesses downstream performance across four tasks: sleep staging, apnea diagnosis, age estimation, and disease and mortality prediction.
  • Multiple pre-training methods achieved comparable performance for sleep staging, apnea diagnosis, and age estimation.
  • Contrastive learning significantly outperforms other SSRL approaches for mortality and disease prediction tasks.
  • Contrastive learning also demonstrates faster convergence during the pre-training phase.
  • The dataset, pre-trained model weights, training pipelines, and evaluation code will be released to facilitate reproducibility and research advancement.
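
To give a sense of scale, the dataset figures quoted above can be turned into a rough average recording length and an approximate epoch count. This is a back-of-the-envelope sketch: the 30-second epoch length is an assumption based on the usual sleep-scoring convention, not a number taken from the paper, and the function names are illustrative only.

```python
# Figures quoted in the key points above.
N_RECORDINGS = 17_467
TOTAL_HOURS = 163_000  # reported as "over 163,000", so this is a lower bound

# Assumption: 30-second scoring epochs (the common sleep-staging convention).
EPOCH_SECONDS = 30


def mean_hours_per_recording(total_hours, n_recordings):
    """Average recording length in hours."""
    return total_hours / n_recordings


def total_epochs(total_hours, epoch_seconds):
    """Number of whole epochs contained in the given number of hours."""
    return int(total_hours * 3600 // epoch_seconds)


if __name__ == "__main__":
    hours = mean_hours_per_recording(TOTAL_HOURS, N_RECORDINGS)
    epochs = total_epochs(TOTAL_HOURS, EPOCH_SECONDS)
    print(f"mean recording length: at least {hours:.2f} h")
    print(f"30-second epochs: at least {epochs:,}")
```

Under these assumptions the average recording is a little over nine hours, which is consistent with overnight studies.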

Methodology

The study involved creating Stanford Sleep Bench, a large-scale multimodal PSG dataset. Various self-supervised representation learning (SSRL) pre-training methods were systematically evaluated on this dataset. Downstream performance was then assessed across four key tasks: sleep staging, apnea diagnosis, age estimation, and disease/mortality prediction, comparing the efficacy and convergence speed of different SSRL approaches.
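
The summary does not say which contrastive objective the authors use. As an illustration of the general idea only, the sketch below implements a one-directional InfoNCE loss in plain Python. It assumes (hypothetically) that each PSG segment yields two embeddings, for example from two different signal groups, that these form a positive pair, and that the other segments in the batch act as negatives. The function names and the temperature value are illustrative choices, not the paper's.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] and positives[i] are two embeddings of the same segment;
    positives[j] for j != i serve as negatives for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    losses = []
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i with every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index as the correct class.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    view_a = [[1.0, 0.0], [0.0, 1.0]]
    view_b_aligned = [[1.0, 0.0], [0.0, 1.0]]
    view_b_swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("aligned pairs:", info_nce(view_a, view_b_aligned))
    print("swapped pairs:", info_nce(view_a, view_b_swapped))
```

The loss is small when each anchor is closest to its own positive and large when the pairing is wrong, which is the signal a contrastive pre-training run would minimize.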

Key Findings

For established sleep analysis tasks (staging, apnea diagnosis, age estimation), multiple pre-training methods showed comparable performance. However, for the more complex and clinically impactful tasks of mortality and disease prediction, contrastive learning emerged as significantly superior to other SSRL approaches, also demonstrating faster convergence during pre-training.
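
The summary does not state which metric underlies the disease and mortality comparison. One common choice for binary outcome prediction is the area under the ROC curve (AUROC), sketched below in plain Python as the fraction of positive/negative pairs that the model ranks correctly. Treat this as an assumed example metric, not the paper's evaluation code; the labels and scores in the usage example are made up.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative.

    labels: iterable of 0/1 outcomes; scores: predicted risk per example.
    Ties between a positive and a negative count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example with invented values, for illustration only.
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    print("AUROC:", auroc(y_true, y_score))
```

The pairwise form is quadratic in the number of examples, which is fine for a sketch; a rank-based implementation would scale better on a benchmark of this size.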

Clinical Impact

This work could support the development of AI models that predict sleep-related diseases and mortality from PSG data. If such models hold up in clinical validation, they could give clinicians additional tools for risk assessment and patient management, extending the use of PSG beyond traditional sleep scoring to broader health outcomes.

Limitations

The abstract does not explicitly detail limitations of the study itself. However, it addresses prior significant limitations in the field, namely the lack of a shared, diverse PSG dataset and benchmark, and the absence of a systematic evaluation of SSRL approaches for sleep foundation models, which Stanford Sleep Bench aims to resolve.

Future Directions

The authors plan to release the Stanford Sleep Bench dataset, pre-trained model weights, training pipelines, and evaluation code. This release aims to foster reproducibility, encourage further research, and accelerate the development of robust sleep foundation models within the scientific community.

Medical Domains

  • Sleep Medicine
  • Neurology
  • Pulmonology
  • Cardiology (indirectly, via apnea and mortality links)
  • Internal Medicine (for general disease/mortality)

Keywords

  • Polysomnography (PSG)
  • Self-supervised Learning
  • Foundation Models
  • Sleep Analysis
  • Disease Prediction
  • Mortality Prediction
  • Contrastive Learning
  • Sleep Staging

Abstract

Polysomnography (PSG), the gold standard test for sleep analysis, generates vast amounts of multimodal clinical data, presenting an opportunity to leverage self-supervised representation learning (SSRL) for pre-training foundation models to enhance sleep analysis. However, progress in sleep foundation models is hindered by two key limitations: (1) the lack of a shared dataset and benchmark with diverse tasks for training and evaluation, and (2) the absence of a systematic evaluation of SSRL approaches across sleep-related tasks. To address these gaps, we introduce Stanford Sleep Bench, a large-scale PSG dataset comprising 17,467 recordings totaling over 163,000 hours from a major sleep clinic, including 13 clinical disease prediction tasks alongside canonical sleep-related tasks such as sleep staging, apnea diagnosis, and age estimation. We systematically evaluate SSRL pre-training methods on Stanford Sleep Bench, assessing downstream performance across four tasks: sleep staging, apnea diagnosis, age estimation, and disease and mortality prediction. Our results show that multiple pretraining methods achieve comparable performance for sleep staging, apnea diagnosis, and age estimation. However, for mortality and disease prediction, contrastive learning significantly outperforms other approaches while also converging faster during pretraining. To facilitate reproducibility and advance sleep research, we will release Stanford Sleep Bench along with pretrained model weights, training pipelines, and evaluation code.