5-class sleep stage classification (Wake, N1, N2, N3, REM) using deep learning on the PhysioNet/Computing in Cardiology Challenge 2018 dataset.
- Source: PhysioNet Challenge 2018 — Massachusetts General Hospital Sleep Lab
- Subjects: 1,985, one polysomnographic recording each
- Signals: 13 channels (EEG, EOG, EMG, ECG, SaO₂) sampled at 200 Hz
- Labels: 30-second epochs annotated per AASM guidelines
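At 200 Hz, each 30-second scoring epoch spans 6,000 samples. A minimal NumPy sketch of splitting a multichannel recording into AASM epochs (the channel count and synthetic signal below are illustrative):

```python
import numpy as np

FS = 200                     # sampling rate (Hz), per the dataset description
EPOCH_SEC = 30               # AASM scoring epoch length (s)
EPOCH_LEN = FS * EPOCH_SEC   # 6,000 samples per epoch

def epochs_from_recording(signals: np.ndarray) -> np.ndarray:
    """Split a (channels, samples) recording into (n_epochs, channels, 6000).

    Trailing samples that do not fill a whole 30 s epoch are dropped.
    """
    n_channels, n_samples = signals.shape
    n_epochs = n_samples // EPOCH_LEN
    trimmed = signals[:, : n_epochs * EPOCH_LEN]
    return trimmed.reshape(n_channels, n_epochs, EPOCH_LEN).transpose(1, 0, 2)

# Example: 13 channels, 5 minutes of synthetic signal -> 10 epochs
rec = np.random.randn(13, FS * 300)
ep = epochs_from_recording(rec)
print(ep.shape)  # (10, 13, 6000)
```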
| Model | Accuracy | N3 AUC | Wake AUC | Notes |
|---|---|---|---|---|
| FFNN (baseline) | 56% | 0.95 | 0.86 | 64 hidden units, ReLU |
| LSTM | 57% | 0.96 | 0.91 | 64→32 units, dropout 0.3 |
| Transformer | 50% | 0.95 | 0.77 | Attention-based temporal modeling |
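The LSTM row of the table (64→32 units, dropout 0.3, 5 output classes) can be sketched in PyTorch roughly as follows; the per-epoch feature dimension and sequence length are illustrative assumptions, not values from the notebooks:

```python
import torch
import torch.nn as nn

class SleepLSTM(nn.Module):
    """Illustrative sketch of the table's LSTM: 64 -> 32 units, dropout 0.3.

    Input: (batch, seq_len, n_features) sequences of per-epoch feature
    vectors; output: 5 logits (Wake, N1, N2, N3, REM) per epoch.
    """
    def __init__(self, n_features: int = 20, n_classes: int = 5):
        super().__init__()
        self.lstm1 = nn.LSTM(n_features, 64, batch_first=True)
        self.drop = nn.Dropout(0.3)
        self.lstm2 = nn.LSTM(64, 32, batch_first=True)
        self.head = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.lstm1(x)
        h, _ = self.lstm2(self.drop(h))
        return self.head(h)  # per-epoch class logits

model = SleepLSTM()
logits = model(torch.randn(8, 40, 20))  # 8 sequences of 40 epochs each
print(logits.shape)  # torch.Size([8, 40, 5])
```

Scoring every epoch in the sequence (rather than only the last hidden state) lets the model use both past and future context at training time while still producing one label per 30 s epoch.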
- Power spectral density extraction (delta, theta, alpha, sigma bands)
- Sleep spindle detection (12–14 Hz sigma band)
- 3-fold cross-validation on the training set
- ROC curves and confusion matrices for all models
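The band-power features above can be sketched with Welch's method from SciPy. The sigma range (12–14 Hz) is taken from the spindle-detection bullet; the other band edges are conventional assumptions, not values confirmed by the notebooks:

```python
import numpy as np
from scipy.signal import welch
from scipy.integrate import trapezoid

FS = 200  # sampling rate (Hz)

# Band edges in Hz. Sigma (12-14 Hz) matches the spindle band in the text;
# the delta/theta/alpha cutoffs are conventional assumptions.
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 12), "sigma": (12, 14)}

def band_powers(epoch: np.ndarray) -> dict:
    """Absolute power per band for a 1-D 30 s epoch, via Welch's PSD."""
    freqs, psd = welch(epoch, fs=FS, nperseg=FS * 4)  # 4 s windows -> 0.25 Hz bins
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        powers[name] = trapezoid(psd[mask], freqs[mask])
    return powers

# Sanity check: a 2 Hz sine should be dominated by delta power
t = np.arange(30 * FS) / FS
powers = band_powers(np.sin(2 * np.pi * 2 * t))
print(max(powers, key=powers.get))  # delta
```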
├── Sleep.ipynb # FFNN baseline
├── Sleep_with_LSTM.ipynb # LSTM implementation
├── SleepTransformer.ipynb # Transformer implementation
└── README.md
All models achieved broadly similar overall accuracy (~50–57%), but their class-wise performance differed:
- N3 (deep sleep): All models perform well (AUC 0.95–0.96), thanks to the distinctive high-amplitude delta waves
- N1 (light sleep): Hardest to classify, frequently confused with both Wake and N2
- REM: Poor recall across all models, likely due to class imbalance and the heavy dependence of REM scoring on EOG signals
The LSTM showed improved Wake detection (AUC 0.91) by capturing temporal dependencies, while the Transformer struggled with the limited training data.
- PyTorch / Keras
- NumPy, Pandas, Scikit-learn
- MNE (EEG processing)
- Matplotlib, Seaborn
Ghassemi MM, et al. "You Snooze, You Win: The PhysioNet/Computing in Cardiology Challenge 2018." Computing in Cardiology Conference (CinC), 2018.