Traditional statistical methods for evaluating prediction models are uninformative as to clinical value: towards a decision analytic framework.
Review
Abstract
Cancer prediction models are becoming ubiquitous, yet we generally have no idea whether they do more good than harm. This is because current statistical methods for evaluating prediction models are uninformative as to their clinical value. Prediction models are typically evaluated in terms of discrimination or calibration. However, it is generally unclear how high discrimination needs to be before it is considered "high enough"; similarly, there are no rational guidelines as to the degree of miscalibration that would preclude clinical use of a model. Classification tables do present the results of models in more clinically relevant terms, but it is not always clear which of two models is preferable on the basis of a particular classification table, or even whether either model should be used at all. Recent years have seen the development of straightforward decision analytic techniques that evaluate prediction models in terms of their consequences. These techniques depend on the simple approach of weighting true and false positives differently, to reflect that, for example, delaying the diagnosis of a cancer is more harmful than an unnecessary biopsy.  Such decision analytic techniques hold the promise of determining whether clinical implementation of prediction models would do more good than harm.
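
To make the weighting of true and false positives concrete, the sketch below shows one way the calculation can be expressed, following the standard net-benefit formulation used in decision curve analysis, in which a false positive is weighted by the odds of the threshold probability. The abstract does not give a formula, so the function name, the example counts, and the 10% threshold are illustrative assumptions, not figures from the article.

# Minimal sketch of net benefit at a chosen threshold probability p_t.
# Assumption: weighting follows the standard net-benefit formulation from
# decision curve analysis; none of these numbers come from the article.
def net_benefit(true_positives: int, false_positives: int, n: int, p_t: float) -> float:
    """Net benefit per patient at threshold probability p_t.

    A false positive is weighted by the odds p_t / (1 - p_t), reflecting that,
    for example, an unnecessary biopsy is less harmful than a delayed cancer
    diagnosis.
    """
    w = p_t / (1 - p_t)
    return true_positives / n - w * false_positives / n

# Hypothetical example: among 1000 patients, the model flags 180 as positive,
# of whom 60 truly have cancer. At a 10% threshold, each false positive
# counts 1/9 as much as a true positive.
print(net_benefit(true_positives=60, false_positives=120, n=1000, p_t=0.10))

A model can then be compared against the default strategies of treating everyone or no one by computing net benefit across a range of clinically reasonable threshold probabilities; the strategy with the highest net benefit at a given threshold is preferred at that threshold.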