Variability in diagnostic error rates of 10 MRI centers performing lumbar spine MRI examinations on the same patient within a 3-week period.

Overview

Abstract

BACKGROUND CONTEXT: In today's health-care climate, magnetic resonance imaging (MRI) is often perceived as a commodity: a service in which there are no meaningful differences in quality, and thus one for which patients can be advised to select a provider based on price and convenience alone. If this prevailing view is correct, then a patient should expect to receive the same radiological diagnosis regardless of which imaging center he or she visits, or which radiologist reviews the examination. Based on their extensive clinical experience, the authors believe that this assumption is not correct and that it can negatively impact patient care, outcomes, and costs.

PURPOSE: This study is designed to test the authors' hypothesis that radiologists' reports from multiple imaging centers performing a lumbar MRI examination on the same patient over a short period of time will have (1) marked variability in interpretive findings and (2) a broad range of interpretive errors.

STUDY DESIGN: This is a prospective observational study of one patient scanned at 10 different MRI centers over a period of 3 weeks. The interpretive findings reported by each center were compared with each other and with reference MRI examinations performed immediately preceding and following the 10 study examinations.

PATIENT SAMPLE: The sample is a single patient, a 63-year-old woman with a history of low back pain and right L5 radicular symptoms.

OUTCOME MEASURES: Variability was quantified using percent agreement rates and the Fleiss kappa statistic. Interpretive errors were quantified using true-positive counts, false-positive counts, false-negative counts, true-positive rate (sensitivity), and false-negative rate (miss rate).

METHODS: Interpretive findings from the 10 study MRI examinations were tabulated and compared for variability and errors. Two of the authors, both subspecialist spine radiologists from different institutions, independently reviewed the reference examinations and then came to a final diagnosis by consensus. An interpretive error was recorded for a study examination when its report included a finding that was not present in the reference examinations (false-positive) or omitted a finding that was present in the reference examinations (false-negative).

RESULTS: Across all 10 study examinations, 49 distinct findings were reported, each describing the presence of a specific pathology at a specific motion segment. No finding was reported in all 10 study examinations, and only one finding was reported in nine of the 10. Of the interpretive findings, 32.7% appeared only once across all 10 of the study examinations' reports. A global Fleiss kappa statistic, computed across all reported findings, was 0.20±0.06, indicating poor overall agreement on interpretive findings. The average interpretive error count in the study examinations was 12.5±3.2 (false-positives and false-negatives combined). The average false-negative count per examination was 10.9±2.9 out of 25 and the average false-positive count was 1.6±0.9, which correspond to an average true-positive rate (sensitivity) of 56.4%±11.7% and an average miss rate of 43.6%±11.7%.

CONCLUSIONS: This study found marked variability in the reported interpretive findings and a high prevalence of interpretive errors in radiologists' reports of an MRI examination of the lumbar spine performed on the same patient at 10 different MRI centers over a short time period. As a result, the authors conclude that where a patient obtains his or her MRI examination and which radiologist interprets the examination may have a direct impact on radiological diagnosis, subsequent choice of treatment, and clinical outcome.
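
ILLUSTRATION: The abstract does not say how the outcome measures were computed, so the following is only a minimal sketch of the standard definitions of the Fleiss kappa statistic and of sensitivity and miss rate. The function names `fleiss_kappa` and `error_rates` and the toy ratings are hypothetical and are not taken from the study. The only study figures used are the reported mean false-negative count (10.9) and the 25 reference findings.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss kappa for a list of subjects, each rated by the same number of raters.

    `ratings` is a list of rows; each row holds one category label per rater
    (for example 1 = finding reported, 0 = finding not reported).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})

    # counts[i][j] = number of raters who assigned subject i to category j
    counts = [[Counter(row)[c] for c in categories] for row in ratings]

    # Observed agreement for each subject, then averaged over subjects
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall proportion of each category
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)

    # Undefined (division by zero) if every rating falls in one category
    return (p_bar - p_e) / (1 - p_e)


def error_rates(false_negatives, n_reference_positives):
    """Return (sensitivity, miss_rate) given a false-negative count."""
    true_positives = n_reference_positives - false_negatives
    sensitivity = true_positives / n_reference_positives
    miss_rate = false_negatives / n_reference_positives
    return sensitivity, miss_rate


# Toy data (hypothetical): 4 findings, each rated by 3 readers
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
kappa = fleiss_kappa(toy)

# Reported mean false-negative count of 10.9 out of 25 reference findings
sensitivity, miss_rate = error_rates(10.9, 25)
```

With the reported averages, `error_rates(10.9, 25)` gives a sensitivity of about 0.564 and a miss rate of about 0.436, matching the 56.4% and 43.6% stated in the results. Because the denominator is fixed at 25 for every examination, the average of the per-examination rates equals the rate computed from the average count.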