Abstract
Study Objectives:
Inter-scorer variability in sleep staging of polysomnograms (PSGs) results primarily from difficulty in determining whether: (1) an electroencephalogram pattern of wakefulness spans > 15 sec in transitional epochs, (2) spindles or K complexes are present, and (3) duration of delta waves exceeds 6 sec in a 30-sec epoch. We hypothesized that providing digitally derived information about these variables to PSG scorers may reduce inter-scorer variability.
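The three determinations listed above can be written as explicit threshold checks on digitally derived values. The following is a minimal sketch only: the `EpochFeatures` container, its field names, and the `answer_scoring_questions` helper are hypothetical illustrations, not the authors' system. Only the thresholds (15 sec, 6 sec, 30-sec epoch) come from the text.

```python
from dataclasses import dataclass

EPOCH_SEC = 30  # epoch length stated in the abstract


@dataclass
class EpochFeatures:
    """Hypothetical per-epoch digital measurements (names are illustrative)."""
    wake_pattern_sec: float   # seconds of wake EEG pattern in the epoch
    has_spindle_or_k: bool    # True if a spindle or K complex was detected
    delta_sec: float          # total duration of delta waves in seconds


def answer_scoring_questions(f: EpochFeatures) -> dict:
    """Answer the three questions named in the Study Objectives."""
    if not (0 <= f.wake_pattern_sec <= EPOCH_SEC and 0 <= f.delta_sec <= EPOCH_SEC):
        raise ValueError("durations must lie within one epoch")
    return {
        "wake_pattern_over_15_sec": f.wake_pattern_sec > 15,
        "spindle_or_k_present": f.has_spindle_or_k,
        "delta_over_6_sec": f.delta_sec > 6,
    }


example = EpochFeatures(wake_pattern_sec=16.0, has_spindle_or_k=False, delta_sec=2.0)
answers = answer_scoring_questions(example)
```

Stating the answers as booleans makes the point of the hypothesis concrete: the scorer still assigns the stage, but no longer has to estimate the durations by eye.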
Methods:
Fifty-six PSGs were scored (five-stage) by two experienced technologists (first manual, M1). Months later, the technologists edited their own scoring (second manual, M2). The PSGs were then scored by an automatic system, and the same two technologists plus an additional experienced technologist edited the automatic scoring epoch by epoch (Edited-Auto). This resulted in seven manual scores for each PSG (two M1, two M2, and three Edited-Auto). The two M2 scores were then independently modified using digitally obtained values for sleep depth and delta duration and digitally identified spindles and K complexes.
Results:
Percent agreement between scorers in M2 was 78.9 ± 9.0% before modification and 96.5 ± 2.6% after. Errors of this approach were defined as a change in a manual score to a stage that was not assigned by any scorer during the seven manual scoring sessions. Total errors averaged 7.1 ± 3.7% and 6.9 ± 3.8% of epochs for scorers 1 and 2, respectively, and there was excellent agreement between the modified score and the initial manual score of each technologist.
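The two quantities reported above, percent agreement and the error rate as defined in the text, can be computed per PSG from epoch-by-epoch stage labels. This is a minimal sketch under stated assumptions: stages are plain strings, all score lists are aligned by epoch, and the function names and toy data are hypothetical rather than taken from the study.

```python
def percent_agreement(scores_a, scores_b):
    """Percentage of epochs given the same stage by two scorers."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return 100.0 * matches / len(scores_a)


def error_percent(original, modified, manual_sessions):
    """Percentage of epochs where the modification changed the stage to one
    that no scorer assigned in any of the manual scoring sessions."""
    if len(original) != len(modified) or not original:
        raise ValueError("score lists must be non-empty and of equal length")
    errors = 0
    for i, (before, after) in enumerate(zip(original, modified)):
        changed = after != before
        unseen = all(session[i] != after for session in manual_sessions)
        if changed and unseen:
            errors += 1
    return 100.0 * errors / len(original)


# Toy data (invented for illustration, not study data).
scorer_1 = ["W", "N1", "N2", "N2", "N3"]
scorer_2 = ["W", "N2", "N2", "N3", "N3"]
agreement = percent_agreement(scorer_1, scorer_2)  # 3 of 5 epochs match

original = ["W", "N1", "N2", "N2"]
modified = ["W", "N2", "N2", "REM"]
sessions = [["W", "N1", "N2", "N2"], ["W", "N2", "N2", "N3"]]
errors = error_percent(original, modified, sessions)  # only the last epoch counts
```

In the toy data the change from N1 to N2 is not an error because another session assigned N2 to that epoch, whereas the change to REM is, since no session assigned REM there.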
Conclusions:
Providing digitally obtained information about sleep depth, delta duration, spindles and K complexes during manual scoring can greatly reduce interrater variability in sleep staging by eliminating the guesswork in scoring epochs with equivocal features.
Citation:
Younes M, Hanly PJ. Minimizing interrater variability in staging sleep by use of computer-derived features. J Clin Sleep Med 2016;12(10):1347–1356.



Abbreviations
- AHI: apnea-hypopnea index
- CPAP: continuous positive airway pressure
- EEG: electroencephalogram
- EMG: electromyogram
- ICC: intraclass correlation coefficient
- MSS: Michele sleep scoring system
- N1: stage 1 of non-rapid eye movement sleep
- N2: stage 2 of non-rapid eye movement sleep
- N3: stage 3 of non-rapid eye movement sleep
- NREM: non-rapid eye movement sleep
- ORP: odds ratio product
- OSA: obstructive sleep apnea
- PLM: periodic limb movement
- PSG: polysomnogram
- REM: rapid eye movement sleep
- TST: total sleep time
- W: stage awake
ACKNOWLEDGMENTS
The authors thank Colleen Leslie, John Laprairie, and Michele Ostrowski for scoring the PSGs and for coordinating the study.
Address correspondence to: Magdy Younes, MD, 1001 Wellington Crescent, Winnipeg, Manitoba, Canada R3M 0A7; Email: mkyounes@shaw.ca
Younes, M., Hanly, P. Minimizing Interrater Variability in Staging Sleep by Use of Computer-Derived Features. J Clin Sleep Med 12, 1347–1356 (2016). https://doi.org/10.5664/jcsm.6186
DOI: https://doi.org/10.5664/jcsm.6186


