Discussion
The cumulative logit model subsumes multinomial ordinal outcome levels within a single model, yet each outcome level gets its own cumulative logit, so that predicted individual probabilities for each level (Eq. \ref{eq:4}) are mutually exclusive, comprehensive over the outcome levels, and sum to unity for each observation of the continuous measurement. Another appeal of the cumulative logit model is that its predicted probabilities (Eqs. \ref{eq:3} and \ref{eq:4}) change in direct proportion to the continuous measurement across all outcome levels. Moreover, the ordinality of the outcome ensures that cutpoints separate successive pairs of adjacent outcome levels.
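The sum-to-unity property can be illustrated with a minimal sketch. It assumes the parameterization \(\mathrm{logit}\,P(Y \le j \mid x) = \alpha_j + \beta_j x\), which may differ in sign or direction from Eqs. \ref{eq:3} and \ref{eq:4}. The function names and parameter values are hypothetical and are not taken from the fitted models.

```python
import math

def cumulative_probs(x, intercepts, slopes):
    """P(Y <= j | x), j = 1..J-1, assuming logit P(Y <= j | x) = a_j + b_j * x."""
    return [1.0 / (1.0 + math.exp(-(a + b * x)))
            for a, b in zip(intercepts, slopes)]

def level_probs(x, intercepts, slopes):
    """Individual probabilities P(Y = j | x), j = 1..J, by differencing
    successive cumulative probabilities (with P(Y <= 0) = 0, P(Y <= J) = 1)."""
    cum = [0.0] + cumulative_probs(x, intercepts, slopes) + [1.0]
    return [cum[j + 1] - cum[j] for j in range(len(cum) - 1)]

# Hypothetical values for J = 3 levels with a common slope (proportional odds).
intercepts = [-1.0, 1.0]   # illustrative, increasing in j
slopes = [-0.8, -0.8]      # equal slopes across the cumulative logits
probs = level_probs(0.5, intercepts, slopes)
total = sum(probs)         # equals 1 up to floating-point error
```

The individual probabilities telescope, so they sum to one for any value of the predictor. With unequal slopes (non-proportional odds) the cumulative logits can cross at extreme predictor values, in which case a difference may turn negative, so a check of the fitted range may be worthwhile.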
Assuming proportional odds constrains the log-odds of the continuous predictor to be constant for all levels of the ordinal outcome. This imposes statistical equivalence on the AUCs of the cumulative ROC curves, so that the ROC curves will appear approximately overlapped. In contrast, when the log-odds of the predictor are non-proportional, which represents varying strength in the continuous predictor's association at each outcome level, the AUCs of the cumulative ROC curves will differ and the curves will appear nested. Notably, the rank-order of the AUCs (and hence the order of nesting) is independent of the order of the ordinal outcome levels. This flexibility may be especially desirable in certain settings, such as in a clinical trial where a medication is associated with greater potency at the worst level of the health outcome compared to less acute levels.
Evaluated under simulation, cumulative ROC curve analysis performed as expected under a variety of conditions. One qualification applies: if ROC curve-based cutpoint criteria are to be used, results from simulated unbalanced data indicate that Total Accuracy yields less biased cutpoints than the Youden Index, Matthews Correlation Coefficient, and Markedness. Alternatively, parametric cutpoints have the advantage of being maximum likelihood estimates; their absolute percent biases were smaller than those of Total Accuracy and were often negligible.
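For reference, the four ROC curve-based criteria can be computed from the confusion counts at each candidate cutpoint. The sketch below is an illustration and not the simulation code used here. It assumes a binary (cumulative) dichotomization of the outcome, that larger measurements indicate the positive class, and that a ratio with a zero denominator is set to zero. The function names and toy data are hypothetical.

```python
import math

def _ratio(num, den):
    """num / den, or 0.0 when den is zero (a convention assumed here)."""
    return num / den if den else 0.0

def criteria(tp, fp, tn, fn):
    """Cutpoint criteria from true/false positive and negative counts."""
    sens = _ratio(tp, tp + fn)
    spec = _ratio(tn, tn + fp)
    ppv = _ratio(tp, tp + fp)
    npv = _ratio(tn, tn + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "total_accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "youden": sens + spec - 1.0,
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
        "markedness": ppv + npv - 1.0,
    }

def best_cutpoint(x, y, name):
    """Scan observed values as candidate cutpoints (positive if x >= c) and
    return the first cutpoint that maximizes the named criterion."""
    best_c, best_v = None, -math.inf
    for c in sorted(set(x)):
        tp = sum(1 for xi, yi in zip(x, y) if xi >= c and yi == 1)
        fp = sum(1 for xi, yi in zip(x, y) if xi >= c and yi == 0)
        fn = sum(1 for xi, yi in zip(x, y) if xi < c and yi == 1)
        tn = sum(1 for xi, yi in zip(x, y) if xi < c and yi == 0)
        v = criteria(tp, fp, tn, fn)[name]
        if v > best_v:
            best_c, best_v = c, v
    return best_c, best_v

# Toy data (hypothetical): measurement x and dichotomized outcome y.
x_demo = [1, 2, 3, 4, 5, 6, 7, 8]
y_demo = [0, 0, 0, 1, 0, 1, 1, 1]
youden_cut = best_cutpoint(x_demo, y_demo, "youden")
accuracy_cut = best_cutpoint(x_demo, y_demo, "total_accuracy")
```

Ties are resolved in favor of the smallest cutpoint, which is an arbitrary choice. With unbalanced data the criteria may disagree, which is the situation the simulations address.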
Analysis of real-world data demonstrated that cumulative ROC curve analysis yields reasonable results. Continuous measurements in both datasets displayed varying degrees of overlap among the ternary outcome levels. The tobacco smoke exposure dataset was relatively large, but also strikingly unbalanced across the outcome levels, especially at the intermediate outcome level. The intermediate SHS-exposed category was small (5.5 percent) and the distribution was skewed to the extreme exposure levels (72.8 percent non-exposed vs. 21.7 percent smokers). Nevertheless, the cumulative ROC curve approach identified cutpoints with good to excellent sensitivity and specificity.
The cumulative ROC curve approach readily generalizes to more than three outcome levels through specification of the cumulative logit model. Nonetheless, discriminating discrete outcome levels postulates that the continuous measurement is associated with an a priori number of latent and ordinal classes. If the cumulative logit model in Stage 1 specifies an outcome with \(J > 2\) ordinal levels, determination of cutpoints may be difficult if the outcome is actually binomial or is otherwise different from what was assumed. The magnitude of this difficulty may be revealed in exploratory data analysis, by poor model fit, and by cutpoints with poor sensitivity and specificity. For the tobacco smoke exposure data, although prior assumption of an intermediate outcome level (i.e., secondhand smoke-exposed) was plausible, there was cause for doubt since this level was observed infrequently. In addition, the infrequency of the SHS-exposed outcome level contrasts with the simulated data, where use of a normal distribution as a source of random variates for the continuous predictor leads to more frequent intermediate outcomes (\(\sim\)40 percent) compared to the lowermost and uppermost outcomes (\(\sim\)20 percent each). These considerations notwithstanding, the natural log of urinary NNAL was an excellent discriminator of the three tobacco smoke exposure levels.
The proposed approach admits alternative formulations of the Stage 1 model, where other multinomial models may be implemented through substitution of the cumulative logit link function. Alternative models for ordinal outcomes, such as adjacent categories and continuation ratio (including complementary log-log and Cox proportional hazards), and nominal outcomes (with the generalized logit) all predict probabilities entirely suitable for subsequent calculation of cumulative ROC curves. Conceptual interpretation of these alternative link functions, however, necessarily varies, sometimes substantially, and may therefore be less directly interpretable than the cumulative logit. Exploring the performance and utility of these alternative link functions may nonetheless be fruitful.
Cumulative ROC curve analysis appears to be effective for a univariate continuous predictor, and the regression framework may be extended with the addition of covariates to the Stage 1 cumulative logit model. This can be expected to enhance discriminatory power by accounting for other influential or potentially confounding factors \cite{Tosteson1988}. In any particular case, however, it may not be clear whether additional covariates will adversely affect the overall concavity of the cumulative ROC curves, thereby hindering selection of cutpoints. Stratification by potentially confounding factors may be helpful in resolving these difficulties.
One challenge posed by the cumulative logit model is its sample size demands, which arise from the potentially numerous parameters that must be estimated. In the proportional odds configuration, the univariate cumulative logit model has \(J-1\) intercepts plus one slope for the continuous predictor, but this nearly doubles in the non-proportional odds configuration, which has \(2 \times \left(J-1\right)\) regression parameters.
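As a worked illustration of these counts, write \(p_{\mathrm{PO}}\) and \(p_{\mathrm{NPO}}\) for the number of regression parameters in the proportional and non-proportional odds configurations (this notation is introduced here only for convenience):
\[
p_{\mathrm{PO}} = (J-1) + 1 = J, \qquad
p_{\mathrm{NPO}} = 2\left(J-1\right), \qquad
\frac{p_{\mathrm{NPO}}}{p_{\mathrm{PO}}} = 2 - \frac{2}{J}.
\]
For \(J = 3\) this gives 3 versus 4 parameters, and for \(J = 5\) it gives 5 versus 8. The ratio approaches 2 as \(J\) grows.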
Confidence intervals for parametric cutpoints were computed from \(t_{df=2,\,1-\alpha/2} \times s_{X^*}\) where \(s_{X^*}\) was estimated with Fieller's Method \cite{Fieller1944}. Since the parametric cutpoint is the ratio of two model parameters, the Delta Method and Fieller's Method are both applicable for computing the variance \cite{Cox1990a}, but Fieller's Method is favored since it tends to provide better coverage despite potential asymmetry of the confidence interval \cite{Zerbe1978,Hirschberg2010a}. In addition, \citet*{Hirschberg2010a} recommend Fieller's Method when the computed ratios are positive while the correlation between the numerator and denominator is negative.
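The contrast between the two methods can be sketched for a generic ratio \(\theta = a/b\) of two estimates with variances \(v_{aa}\), \(v_{bb}\) and covariance \(v_{ab}\). This is a textbook-style illustration under stated assumptions, not the computation used for the reported intervals. The numerical inputs are hypothetical, the critical value is supplied by the user (1.96 below, purely for illustration), and the sketch returns no interval when the Fieller confidence set is unbounded.

```python
import math

def delta_interval(a, b, v_aa, v_bb, v_ab, crit):
    """Symmetric interval for a/b from a first-order (Delta Method) variance."""
    theta = a / b
    var = (v_aa - 2.0 * theta * v_ab + theta ** 2 * v_bb) / b ** 2
    half = crit * math.sqrt(var)
    return theta - half, theta + half

def fieller_interval(a, b, v_aa, v_bb, v_ab, crit):
    """Fieller interval for a/b: the roots in theta of
    (a - theta*b)^2 = crit^2 * (v_aa - 2*theta*v_ab + theta^2*v_bb)."""
    qa = b ** 2 - crit ** 2 * v_bb
    qb = a * b - crit ** 2 * v_ab
    qc = a ** 2 - crit ** 2 * v_aa
    disc = qb ** 2 - qa * qc
    if qa <= 0 or disc < 0:
        return None  # confidence set is not a bounded interval; not handled
    root = math.sqrt(disc)
    return (qb - root) / qa, (qb + root) / qa

# Hypothetical inputs: positive ratio, negatively correlated estimates.
a_hat, b_hat = 2.0, 1.0
v_aa, v_bb, v_ab = 0.04, 0.01, -0.005
crit = 1.96  # illustrative critical value
delta_ci = delta_interval(a_hat, b_hat, v_aa, v_bb, v_ab, crit)
fieller_ci = fieller_interval(a_hat, b_hat, v_aa, v_bb, v_ab, crit)
```

In this toy case both intervals contain the point estimate, but the Fieller interval extends further above the estimate than below it, whereas the Delta Method interval is symmetric by construction.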