Sequence data analysis
After removal of terminal adaptor sequences and low-quality data, reads were aligned to the human reference genome (hg19) using BWA (0.7.12-r1039). MuTect2 (Cibulskis et al., 2013) (3.4-46-gbc02625) was employed to call somatic small insertions and deletions (InDels) and single nucleotide variants (SNVs). A mutation was considered a candidate somatic mutation only when (i) at least five high-quality reads (Phred score ≥30, mapping quality ≥30, and no paired-end read bias) supported the variant base; (ii) the mutation was not present in >1% of the population in the 1000 Genomes Project or dbSNP (the Single Nucleotide Polymorphism Database); and (iii) the mutation was not present in an in-house database of normal samples. In addition, a somatic tumor mutation was called only when the mutant allele was present in ≥3% of reads. Tumor mutation burden (TMB) was calculated as the number of somatic non-synonymous mutations per megabase of the panel region.
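The filtering criteria and the TMB calculation described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study. The `Variant` record, its field names, and the functions `is_candidate_somatic` and `tumor_mutation_burden` are hypothetical, and the panel size passed to the TMB function is a placeholder, since the panel size is not stated here. The sketch also assumes that read counts already reflect the read-level quality filters (Phred score ≥30, mapping quality ≥30, no paired-end read bias) and that the ≥3% threshold is applied to those filtered reads.

```python
from dataclasses import dataclass

# Thresholds taken from the text above.
MIN_ALT_READS = 5          # at least five high-quality supporting reads
MAX_POPULATION_AF = 0.01   # must not exceed 1% in 1000 Genomes or dbSNP
MIN_VAF = 0.03             # mutant allele present in >= 3% of reads


@dataclass
class Variant:
    """Hypothetical per-variant record; field names are illustrative only."""
    alt_reads: int        # high-quality reads supporting the variant base
    total_reads: int      # high-quality reads covering the site (assumption)
    population_af: float  # highest frequency across population databases
    in_normal_db: bool    # seen in the in-house database of normal samples
    nonsynonymous: bool   # annotated as non-synonymous


def is_candidate_somatic(v: Variant) -> bool:
    """Apply criteria (i)-(iii) and the >= 3% allele-fraction cutoff."""
    if v.alt_reads < MIN_ALT_READS:
        return False
    if v.population_af > MAX_POPULATION_AF:
        return False
    if v.in_normal_db:
        return False
    if v.total_reads <= 0:
        return False
    return v.alt_reads / v.total_reads >= MIN_VAF


def tumor_mutation_burden(variants, panel_size_bp: int) -> float:
    """Somatic non-synonymous mutations per megabase of the panel region."""
    if panel_size_bp <= 0:
        raise ValueError("panel_size_bp must be positive")
    n = sum(1 for v in variants if v.nonsynonymous and is_candidate_somatic(v))
    return n / (panel_size_bp / 1_000_000)
```

For example, with a placeholder panel of 1,500,000 bp, three passing non-synonymous variants would give 3 / 1.5 = 2.0 mutations per megabase under this sketch.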
Contra (Li et al., 2012) (2.0.8) was used to detect copy number variations, and the LOH HLA algorithm was used to identify loss of heterozygosity (LOH) based on informative SNPs. For structural variations (SVs), baits were designed to capture selected exons and introns of the RET, ALK, ROS1, and NTRK1 oncogenes based on previously reported SVs in these genes, and an in-house algorithm was used to identify SVs from split reads and discordant read pairs. All final candidate variants were manually verified with the Integrative Genomics Viewer browser.