Introduction:
The past few decades have seen a rise in human papillomavirus (HPV)-related oropharyngeal squamous cell carcinoma (OPSCC). This epidemiologic shift has prompted de-escalation trials that may change how this cancer is treated. The optimal treatment for T1-2, N0-N1 OPSCC remains under debate as the understanding of the disease process and the available technologies continue to evolve.1-3
Current National Comprehensive Cancer Network (NCCN) guidelines recommend single-modality treatment with either surgery or radiation for both HPV-related and non-HPV-related T1-2, N0-N1 OPSCC. While the results of several ongoing clinical trials are awaited, systematic and retrospective reviews suggest no difference in survival outcomes between the two modalities for early-stage OPSCC.1,4
Survival outcomes appear equivalent with either treatment modality in this population, yet little work has examined how patient, socioeconomic, regional, or institutional factors influence the choice of primary treatment modality for this category of OPSCC. This question is best addressed with large national data registries and a methodology capable of analyzing multiple layers of influence.
Machine learning (ML) is a novel form of analysis that applies statistical theory to build prediction models.5,6 Many of the statistical principles underlying machine learning are shared with the traditional statistical methods used in clinical medicine, but the primary objective of machine learning is to predict an unknown component rather than to draw inferences.5,7,8 Machine learning excels at analyzing the complicated interactions that exist among many variables.5,6 There is growing interest across fields of medicine in using machine learning to improve upon current methodologies.5-7
This study therefore seeks to use machine learning to create a prediction model for the primary treatment modality of patients with T1-2, N0-N1 OPSCC by examining patient, socioeconomic, regional, and institutional factors in addition to tumor factors. In doing so, this study aims to demonstrate how machine learning can be used to create prediction models in a reproducible manner and to provide insight into the variables that influence treatment patterns.