Discrimination.
Discrimination is defined as the model’s ability to distinguish
between participants who do or do not experience the event of interest
(e.g., disease outcome such as hypertension). A good prediction model
can accurately discriminate between those with and without the
outcome [5]. The C-statistic, which is equal to the area
under the receiver operating characteristic (ROC) curve for binary
outcomes, is commonly used to assess discrimination. The ROC curve plots
sensitivity against (1 – specificity) across consecutive cutoffs for
the predicted probability of the outcome. The value of the C-statistic
(the area under the ROC curve) is the probability that a randomly selected
subject who experienced the outcome has a higher predicted
probability of the outcome than a randomly selected
subject who did not experience it. The C-statistic usually ranges from
0.5 to 1, with higher values indicating better discrimination; values
below 0.5 are possible but indicate discrimination worse than chance. A
C-statistic of 0.5 indicates that the model predicts the
outcome no better than random chance, while a C-statistic of 1
indicates that the model perfectly distinguishes those who will experience
the outcome from those who will not. Generally, the C-statistic of a
prediction model ranges from 0.6 to 0.85. A model with a C-statistic
between 0.70 and 0.80 is considered adequate, while one between 0.80
and 0.90 is considered excellent [6].
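The equivalence between the pairwise-probability interpretation and the area under the ROC curve can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the function names `c_statistic` and `roc_auc` and the seven-subject toy data are hypothetical and chosen only for illustration, and tied predictions are assumed to count as half a concordant pair.

```python
from itertools import product


def c_statistic(outcomes, predicted):
    """Pairwise C-statistic for a binary outcome.

    Share of (event, non-event) pairs in which the subject with the event
    has the higher predicted probability. Ties count as 0.5 (an assumption
    of this sketch).
    """
    events = [p for y, p in zip(outcomes, predicted) if y == 1]
    non_events = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one subject with and one without the outcome")
    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0
        elif p_event == p_non_event:
            score += 0.5
    return score / (len(events) * len(non_events))


def roc_auc(outcomes, predicted):
    """Area under the ROC curve, using the trapezoidal rule over all cutoffs."""
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    points = [(0.0, 0.0)]  # (1 - specificity, sensitivity)
    for cutoff in sorted(set(predicted), reverse=True):
        tp = sum(1 for y, p in zip(outcomes, predicted) if y == 1 and p >= cutoff)
        fp = sum(1 for y, p in zip(outcomes, predicted) if y == 0 and p >= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data: 1 = outcome occurred, 0 = it did not.
toy_outcomes = [1, 1, 1, 0, 0, 0, 0]
toy_predicted = [0.9, 0.7, 0.4, 0.8, 0.4, 0.3, 0.1]

print(round(c_statistic(toy_outcomes, toy_predicted), 3))  # 0.792
print(round(roc_auc(toy_outcomes, toy_predicted), 3))      # 0.792
```

In this toy example, 9.5 of the 12 event/non-event pairs are ordered correctly (one pair is tied), so both routes give about 0.79.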
For survival data, an extension of the C-statistic called Harrell’s
C-statistic is suggested. It indicates the proportion of all pairs of
subjects that can be ordered in which the subject who survived longer
also has the longer predicted survival time, assuming that these
subject pairs are selected at random. Although the C-statistic is
insensitive to outcome incidence, one disadvantage is that its
interpretation rests on an artificial situation: it assumes that we are
presented with a pair of patients, one with and one without the outcome.
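The pairwise logic behind Harrell’s C-statistic can be sketched in the same way. This is a simplified illustration under stated assumptions rather than a definitive implementation: the function name `harrell_c` and the five-subject data are hypothetical, a pair is treated as orderable only when the subject with the shorter follow-up time actually experienced the event (so censored subjects cannot be the "earlier" member of a pair), pairs with identical follow-up times are skipped, and tied predictions count as 0.5.

```python
def harrell_c(times, events, predicted_times):
    """Simplified Harrell's C-statistic.

    times            observed follow-up times
    events           1 if the event was observed, 0 if censored
    predicted_times  model-predicted survival times (longer = lower risk)
    """
    concordant = 0.0
    orderable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must have an observed event strictly before time j;
            # otherwise we cannot tell which of the two survived longer.
            if events[i] == 1 and times[i] < times[j]:
                orderable += 1
                if predicted_times[i] < predicted_times[j]:
                    concordant += 1.0  # model agrees with observed order
                elif predicted_times[i] == predicted_times[j]:
                    concordant += 0.5  # tied prediction (assumption)
    if orderable == 0:
        raise ValueError("no orderable pairs in the data")
    return concordant / orderable


# Hypothetical toy data: five subjects, two of them censored.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_predicted_times = [3, 7, 5, 9, 12]

print(harrell_c(toy_times, toy_events, toy_predicted_times))  # 0.875
```

Here 8 pairs can be ordered and the model ranks 7 of them correctly, giving 7/8 = 0.875. Established survival-analysis packages handle ties and censoring more carefully and are preferable for real analyses.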