Minimizing Interrater Variability in Staging Sleep by Use of Computer-Derived Features

  • Scientific Investigations
Journal of Clinical Sleep Medicine

Abstract

Study Objectives:

Inter-scorer variability in sleep staging of polysomnograms (PSGs) results primarily from difficulty in determining whether: (1) an electroencephalogram pattern of wakefulness spans > 15 sec in transitional epochs, (2) spindles or K complexes are present, and (3) duration of delta waves exceeds 6 sec in a 30-sec epoch. We hypothesized that providing digitally derived information about these variables to PSG scorers may reduce inter-scorer variability.
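The three questions above are threshold tests on measurable quantities, so they can be answered mechanically once the quantities are available. The following is a minimal sketch, assuming per-epoch measurements already exist; the `EpochFeatures` type and its field names are hypothetical, and this is not the authors' system. It answers the three questions only and does not assign a sleep stage.

```python
from dataclasses import dataclass

# Thresholds taken from the three questions in the abstract.
WAKE_THRESHOLD_SEC = 15.0   # wake pattern must span more than 15 s
DELTA_THRESHOLD_SEC = 6.0   # delta waves must exceed 6 s in a 30-s epoch


@dataclass
class EpochFeatures:
    """Hypothetical per-epoch measurements (names are illustrative)."""
    wake_pattern_sec: float  # seconds of wake-pattern EEG in the epoch
    n_spindles: int          # digitally identified spindles
    n_k_complexes: int       # digitally identified K complexes
    delta_sec: float         # seconds of delta waves in the epoch


def answer_staging_questions(f: EpochFeatures) -> dict:
    """Return yes/no answers to the three equivocal scoring questions."""
    return {
        "wake_pattern_over_15s": f.wake_pattern_sec > WAKE_THRESHOLD_SEC,
        "spindle_or_k_complex_present": (f.n_spindles + f.n_k_complexes) > 0,
        "delta_over_6s": f.delta_sec > DELTA_THRESHOLD_SEC,
    }


# A transitional epoch with 14 s of wake pattern, one spindle, 6.5 s of delta.
answers = answer_staging_questions(EpochFeatures(14.0, 1, 0, 6.5))
```

Two scorers who disagree about whether 14 s or 16 s of wake pattern is present would reach different stages; a single measured value removes that source of disagreement.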

Methods:

Fifty-six PSGs were scored (five-stage) by two experienced technologists (first manual, M1). Months later, the technologists edited their own scoring (second manual, M2). The PSGs were then scored with an automatic system, and the same two technologists plus an additional experienced technologist edited the automatic scoring epoch by epoch (Edited-Auto). This resulted in seven manual scores for each PSG (two M1, two M2, and three Edited-Auto). The two M2 scores were then independently modified using digitally obtained values for sleep depth and delta duration and digitally identified spindles and K complexes.

Results:

Percent agreement between scorers in M2 was 78.9 ± 9.0% before modification and 96.5 ± 2.6% after. Errors of this approach were defined as a change in a manual score to a stage that was not assigned by any scorer during the seven manual scoring sessions. Total errors averaged 7.1 ± 3.7% and 6.9 ± 3.8% of epochs for scorers 1 and 2, respectively, and there was excellent agreement between the modified score and the initial manual score of each technologist.
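The two quantities reported above can be sketched for a single PSG as follows. This is an illustrative reading of the definitions in the abstract, not the authors' code; the function names and the stage labels in the example are assumptions. The abstract reports means ± SD across PSGs, whereas each function here returns the value for one recording.

```python
from typing import Sequence


def percent_agreement(a: Sequence[str], b: Sequence[str]) -> float:
    """Percent of epochs to which two scorers assigned the same stage."""
    if len(a) != len(b) or not a:
        raise ValueError("scores must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def error_rate(original: Sequence[str],
               modified: Sequence[str],
               manual_sessions: Sequence[Sequence[str]]) -> float:
    """Percent of epochs where modification changed the stage to one that
    no scorer assigned in any of the manual scoring sessions."""
    if len(original) != len(modified) or not original:
        raise ValueError("scores must be non-empty and of equal length")
    errors = 0
    for i, (old, new) in enumerate(zip(original, modified)):
        if new != old:
            assigned = {session[i] for session in manual_sessions}
            if new not in assigned:
                errors += 1
    return 100.0 * errors / len(original)


# Toy example with four epochs and two manual sessions.
scorer_1 = ["W", "N1", "N2", "N3"]
scorer_2 = ["W", "N2", "N2", "N3"]
modified_1 = ["W", "N2", "N3", "N3"]

agreement = percent_agreement(scorer_1, scorer_2)              # 3 of 4 epochs match
errors = error_rate(scorer_1, modified_1, [scorer_1, scorer_2])  # 1 of 4 epochs
```

In the toy example, the change in the second epoch (N1 to N2) is not counted as an error because another session assigned N2, while the change in the third epoch (N2 to N3) is counted because no session assigned N3.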

Conclusions:

Providing digitally obtained information about sleep depth, delta duration, spindles and K complexes during manual scoring can greatly reduce interrater variability in sleep staging by eliminating the guesswork in scoring epochs with equivocal features.

Citation:

Younes M, Hanly PJ. Minimizing interrater variability in staging sleep by use of computer-derived features. J Clin Sleep Med 2016;12(10):1347–1356.

Abbreviations

AHI: apnea-hypopnea index
CPAP: continuous positive airway pressure
EEG: electroencephalogram
EMG: electromyogram
ICC: intraclass correlation coefficient
MSS: Michele sleep scoring system
N1: stage 1 of non-rapid eye movement sleep
N2: stage 2 of non-rapid eye movement sleep
N3: stage 3 of non-rapid eye movement sleep
NREM: non-rapid eye movement sleep
ORP: odds ratio product
OSA: obstructive sleep apnea
PLM: periodic limb movement
PSG: polysomnogram
REM: rapid eye movement sleep
TST: total sleep time
W: stage awake

ACKNOWLEDGMENTS

The authors thank Colleen Leslie, John Laprairie, and Michele Ostrowski for scoring the PSGs and for coordinating the study.

Author information

Address correspondence to: Magdy Younes, MD, 1001 Wellington Crescent, Winnipeg, Manitoba Canada R3M 0A7; Email: mkyounes@shaw.ca

Cite this article

Younes, M., Hanly, P. Minimizing Interrater Variability in Staging Sleep by Use of Computer-Derived Features. J Clin Sleep Med 12, 1347–1356 (2016). https://doi.org/10.5664/jcsm.6186

  • DOI: https://doi.org/10.5664/jcsm.6186

Keywords