Real-World Data

Two publicly available datasets with ternary ordinal outcomes were analyzed with the cumulative ROC curve approach, with cutpoints both selected by the Total Accuracy criterion and computed parametrically. Panel (a) in Figures \ref{550411} -- \ref{370806} displays histograms for each dataset overlaid with Total Accuracy cutpoints, while panel (b) shows the cutpoints on their respective cumulative ROC curves. Tables \ref{Table1} -- \ref{Table2} present the Total Accuracy and parametric cutpoints, as well as their sensitivities, specificities, and AUCs. Confidence intervals for parametric cutpoints were calculated with Fieller's Method \cite{Fieller1944} and for AUCs with Wald's Method \cite{Wald1939}. Cumulative logit regression models and cumulative ROC curves were computed with the FREQ and LOGISTIC procedures of the SAS software application, version 9.4 \cite{SASInstitute2010}.

Cork Stopper Quality

The data comprise measurements of material defects appearing in digital images of cork stoppers \cite{Campilho1985,DeSa2001}. An automated image processing system optically scanned cork defects and quantified several characteristics, including the number, area, and perimeter of the defects. Fifty cork stoppers were measured at each of three quality levels \(\left(N=150\right)\), where \(Y=1\ \text{(poor)}\), \(2\ \text{(normal)}\), and \(3\ \text{(superior)}\). In the Stage 1 cumulative logit model, cork stopper quality was predicted by defect area, measured as the total number of pixels with defects. The score test for proportional odds (p-value = 0.31) supported a proportional odds configuration for the model. The parameter estimates are: \(\hat{\alpha}_1=-13.64\), \(\hat{\alpha}_2=-7.05\), and \(\hat{\beta}_{area}=-0.036\).
The ability of defect area to discriminate cork stopper quality is excellent, with \(\text{AUCs} > 0.97\) for both cumulative ROC curves (Table \ref{Table1}). Total Accuracy identified cutpoints where the total number of pixels with defects was 205 (distinguishing poor or normal quality vs. superior) and 369 (poor vs. normal or superior), and both had excellent sensitivities and specificities \(> 0.93\). Parametrically computed cutpoints were at 194.8 [95\% CI: 177.1, 213.7] and 376.8 [352.9, 403.9] pixels, with sensitivities and specificities comparable to those of the Total Accuracy cutpoints, although specificity for the lower cutpoint and sensitivity for the upper cutpoint were somewhat attenuated.

Tobacco Smoke Exposure

Human exposure to chemicals can be estimated from measurements of trace compounds in samples of human urine. Some of these compounds, known as biomarkers, are associated with exposure to tobacco smoke, which may arise either from direct inhalation while smoking or from indirect inhalation of tobacco smoke present in the environment (i.e., second-hand tobacco smoke; SHS). One such biomarker is a tobacco-specific N-nitrosamine known as NNAL (4-[methylnitrosamino]-1-[3-pyridyl]-1-butanol; CAS No. 76014-81-8), which is present in both mainstream tobacco smoke and smokeless tobacco products. NNAL was measured in urine from a representative sample of the United States civilian population \(\ge 12\) years old \(\left(N=16{,}900 \right)\) obtained during the 2007 -- 2012 cycles of the National Health and Nutrition Examination Survey (NHANES; \citealt{CDCNationalCenterforHealthStatisticsa}). Subjects reported being in one of three ordinal exposure categories: non-exposed subjects who neither used tobacco products nor were exposed to SHS \(\left(Y=1; n_1=12{,}372 \right)\); SHS-exposed subjects who did not smoke tobacco \(\left(Y=2; n_2=927 \right)\); and tobacco smokers \(\left(Y=3; n_3=3{,}691 \right)\). Subjects who reported using smokeless tobacco were excluded from this analysis in order to eliminate this potential source of NNAL. The natural log of urinary NNAL concentration predicted exposure levels in the Stage 1 cumulative logit model. The score test (p-value \(< 0.001\)) supported a non-proportional odds configuration for the model. The parameter estimates are: \(\hat{\alpha}_1=-4.60\), \(\hat{\alpha}_2=-4.08\), \(\hat{\beta}_{\ln \left(NNAL \right),1}=-1.13\), and \(\hat{\beta}_{\ln \left(NNAL \right),2}=-1.25\).
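As a rough consistency check on these estimates, suppose that the parametric cutpoint is the value of \(\ln(NNAL)\) at which the fitted cumulative logit \(\hat{\alpha}_j+\hat{\beta}_{\ln \left(NNAL \right),j}\,x\) equals zero. This is an assumption made here for illustration, since the cutpoint formula is not restated in this section, and \(c_j\) is a hypothetical symbol for that value. The rounded estimates then give
\[
c_1 = -\frac{\hat{\alpha}_1}{\hat{\beta}_{\ln \left(NNAL \right),1}} = -\frac{-4.60}{-1.13} \approx -4.07,
\qquad
c_2 = -\frac{\hat{\alpha}_2}{\hat{\beta}_{\ln \left(NNAL \right),2}} = -\frac{-4.08}{-1.25} \approx -3.26 .
\]
Because the inputs are rounded to two decimal places, these ratios may differ slightly from cutpoints computed from the unrounded estimates.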
The ability of \(\ln(NNAL)\) to discriminate ternary tobacco smoke exposure levels is excellent, with \(\text{AUCs} > 0.95\) for both cumulative ROC curves (Table \ref{Table2}). Total Accuracy identified cutpoints at \(\ln(NNAL)\) concentrations of \(-4.09\) (non-exposed vs. SHS-exposed, smokers) and \(-3.38\) ng/mL (non-exposed, SHS-exposed vs. smokers). Since the non-proportional odds configuration permits each cumulative ROC curve to differ in discriminatory power, the cumulative ROC curve associated with the lower cutpoint had an AUC of 0.9515 [95\% CI: 0.9476, 0.9554], while the upper cutpoint's curve had an AUC that was slightly but significantly better at 0.9639 [0.9603, 0.9674]. Parametric cutpoints are at \(-4.05\) [95\% CI: \(-4.11\), \(-4.00\)] and \(-3.26\) [\(-3.32\), \(-3.21\)] ng/mL. Total Accuracy's upper cutpoint was below the lower 95\% confidence limit of the parametric upper cutpoint, but it is unclear which is preferable. The Total Accuracy upper cutpoint had excellent sensitivity (0.9597) and good specificity (0.8067), but this was reversed for the parametric upper cutpoint, which had good sensitivity (0.8092) and excellent specificity (0.9585). Another basis for comparison is the Total Accuracy criterion, which can be calculated for parametric cutpoints from their \(TP_j\), \(TN_j\), \(FP_j\), and \(FN_j\). This comparison was also inconclusive, since the criterion values for the Total Accuracy and parametric upper cutpoints were nearly identical (0.9163 vs. 0.9161).
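For reference, the calculation can be sketched under the assumption that the criterion takes the usual form of overall classification accuracy at the \(j\)th cutpoint. Its definition is not restated in this section, and \(TA_j\), \(Se_j\), \(Sp_j\), \(n_j^{+}\), and \(n_j^{-}\) are illustrative symbols rather than notation taken from the analysis:
\[
TA_j = \frac{TP_j + TN_j}{TP_j + TN_j + FP_j + FN_j}
     = \frac{Se_j\, n_j^{+} + Sp_j\, n_j^{-}}{n_j^{+} + n_j^{-}},
\qquad
n_j^{+} = TP_j + FN_j, \quad n_j^{-} = TN_j + FP_j .
\]
Under this form, a loss in sensitivity can be offset by a gain in specificity (or vice versa), which is one way two cutpoints with reversed sensitivity and specificity can yield nearly equal criterion values.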