TCGA and GEO data download
We downloaded colon adenocarcinoma (COAD) mRNA expression data and clinical data from TCGA
(http://portal.gdc.cancer.gov/). A total of 398 COAD samples and 39
normal colon samples were obtained from TCGA for gene expression
analysis and prognostic analysis. The RNA sequencing matrix files of the
individual samples were organized and annotated against the genome, and
mRNA expression values were extracted from the resulting data matrix.
A data set containing 1048 COAD samples (GSE40976) was downloaded from
GEO (https://www.ncbi.nlm.nih.gov/geo/). The TCGA-COAD and GSE40976
data were collated, extracted, annotated and standardized using
Strawberry Perl (version 5.32.0) and R (version 4.0.2)
(https://www.r-project.org/).
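
The collation and standardization steps are not specified in detail here, so the following is only a minimal sketch of what such a pipeline might look like. It is written in Python for illustration rather than in the Perl and R scripts actually used. It assumes that duplicate gene symbols are averaged and that expression values are log2(x + 1) transformed; both choices are assumptions, not the documented procedure. The barcodes, gene names and values are hypothetical toy data. The only fixed convention used is the TCGA barcode sample-type code (01 to 09 for tumor, 10 to 19 for normal, 20 to 29 for control samples).

```python
import math
from collections import defaultdict


def sample_group(barcode):
    """Classify a TCGA barcode by the two-digit sample-type code in its fourth field."""
    code = int(barcode.split("-")[3][:2])
    if code < 10:
        return "tumor"
    if code < 20:
        return "normal"
    return "control"


def collapse_duplicates(rows):
    """Average rows that share a gene symbol (assumed strategy).

    rows is a list of (gene_symbol, [value_per_sample]) pairs.
    """
    grouped = defaultdict(list)
    for symbol, values in rows:
        grouped[symbol].append(values)
    return {
        symbol: [sum(column) / len(column) for column in zip(*value_lists)]
        for symbol, value_lists in grouped.items()
    }


def log2_transform(matrix):
    """Apply log2(x + 1) to every value (assumed standardization step)."""
    return {
        symbol: [math.log2(value + 1) for value in values]
        for symbol, values in matrix.items()
    }


# Hypothetical toy input: two samples, one duplicated gene symbol.
samples = ["TCGA-AA-0001-01A", "TCGA-AA-0002-11A"]
rows = [
    ("GENE_A", [3.0, 7.0]),
    ("GENE_A", [5.0, 9.0]),
    ("GENE_B", [0.0, 1.0]),
]

groups = [sample_group(barcode) for barcode in samples]
expr = log2_transform(collapse_duplicates(rows))
```

In this sketch, `groups` separates tumor from normal samples for the expression comparison, and `expr` holds one transformed row per unique gene symbol.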