RESULTS
Gene expression of differentiation markers was evaluated for specific stages: stem cells (NANOG, POU5F1, SOX2), definitive endoderm (CXCR4, FOXA2, SOX17), hepatoblasts (AFP, CK19, HNF4A), and hepatocytes (ALB, CYP3A4, CK18, HNF1A, G6PC, TDO2, UGT1A1). In addition, expression data for genes associated with the DNA methylation machinery (DNMT1, DNMT3A, DNMT3B, DNMT3L, TET1, TET2, TET3, UHRF1) were recovered from our previous study15.
Pairwise correlation analysis of the expression of the 24 genes pointed to three groups with correlated expression patterns (Figure 1): (1) AFP, CK19, UHRF1, DNMT3B, HNF1A, FOXA2, HNF4A, and SOX2; (2) DNMT3A, TET3, TET1, DNMT1, and TET2; and (3) ALB, TDO2, CYP3A4, CK18, and UGT1A1.
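As a minimal sketch of how a pairwise correlation matrix of this kind can be computed, the Python snippet below correlates every gene with every other gene across samples. The correlation coefficient used for Figure 1 is not stated in this section, so Pearson's r is an assumption here, and the expression values are made up purely for illustration (only the gene symbols come from the text).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(expression):
    """Return {gene_a: {gene_b: r}} for every pair of genes."""
    genes = list(expression)
    return {
        a: {b: pearson(expression[a], expression[b]) for b in genes}
        for a in genes
    }


# Hypothetical expression values (one value per sample); NOT data from the study.
toy_expression = {
    "AFP":    [9.1, 8.7, 3.2, 2.9, 1.0],
    "HNF4A":  [8.8, 8.9, 3.5, 3.1, 1.4],
    "ALB":    [1.2, 1.5, 6.8, 7.4, 9.6],
    "CYP3A4": [0.9, 1.1, 6.1, 7.9, 9.2],
}

corr = correlation_matrix(toy_expression)
for gene, row in corr.items():
    print(gene, {other: round(r, 2) for other, r in row.items()})
```

In this toy input, genes that rise and fall together across samples have coefficients near +1 and genes with opposite trends have coefficients near -1; groups of mutually correlated genes are then read off the matrix.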
Based on the (Supporting Information Figure S1A and S1B) and K-means (Supporting Information Figure S2), the Set-1 cluster (in black) was composed of hepatoblasts and hepatocyte-like cells and only three HBs (33T, 35T, and 43T), suggesting that these tumors have a gene expression profile similar to transitional phases of hepatocyte differentiation. A second group (Set-2, in red) contained nine HB samples (18T, 28T, 32T, 37T, 38T, 40T, 42T, 44T, and 45T). The largest group (Set-3, in green) clustered all nine non-tumoral liver samples plus nine HBs (15T, 17T, 30T, 31T, 34T, 36T, 39T, 41T, and 46T), showing that these tumors exhibited a gene expression profile similar to that of differentiated livers. The Set-2 cluster (red) partially overlapped Set-1 and Set-3, pointing to mixed gene expression characteristics consistent with a more advanced phase of hepatocyte differentiation than that of the Set-1 group. Finally, an isolated group consisting only of the iPSC and the definitive endoderm cells was detected (Set-4, in blue); as expected, it exhibited the most distinctive gene expression pattern of all groups.
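To illustrate the K-means step mentioned above, the following sketch implements plain Lloyd's algorithm in Python. It is an illustration under stated assumptions, not the procedure used in the study: the two features per sample, the sample values, the choice of k = 2, and the deterministic seeding from the first k points are all hypothetical simplifications.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm. Centroids are seeded with the first k points
    (a deterministic simplification; real analyses usually use random
    or k-means++ seeding with several restarts)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical samples described by two made-up scores
# (hepatoblast-marker score, hepatocyte-marker score); NOT study data.
toy_samples = [
    (9.0, 1.0), (1.0, 9.0), (8.5, 1.5),
    (1.2, 8.8), (9.2, 0.8), (0.9, 9.3),
]

labels, centroids = kmeans(toy_samples, k=2)
print(labels)
print(centroids)
```

In the study the samples are described by the expression of the full gene panel rather than two scores, but the assignment and update steps are the same in any number of dimensions.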
Differential expression analysis indicated that 13 out of the 24 genes showed differences between the Set-1, Set-2, and Set-3 groups (Supporting Information Figure S3). Of these thirteen genes, six were associated with the epigenetic machinery (TET1, TET2, TET3, DNMT1, DNMT3A, and UHRF1), one was a definitive endoderm marker (FOXA2), two were hepatoblast markers (AFP and HNF4A), and four were hepatocyte markers (ALB, CYP3A4, TDO2, and UGT1A1). The gene expression analysis showed that Set-1 tumors have upregulation of hepatoblast markers (Figure 3A-C) and high expression of the DNA methylation genes, mainly UHRF1 and TET1 (Figure 3D-E). The Set-2 tumors exhibit upregulation of the DNA methylation genes (Figure 3D-I) and low expression of both markers of the intermediary stages of differentiation (Figure 3A-C) and mature hepatocyte genes (Figure 3J-M). Finally, Set-3 samples show down-regulation of the DNA methylation genes (Figure 3D-I), associated with high expression of mature hepatocyte marker genes (Figure 3J-K). For these 13 differentially expressed genes, the analysis of gene distance identified three groups with similar expression patterns (UHRF1, AFP, FOXA2, and HNF4A; DNMT3A, TET3, TET1, DNMT1, and TET2; and ALB, TDO2, CYP3A4, and UGT1A1). In addition, the analysis showed that, in HB, genes of the epigenetic machinery have an expression pattern inverse to that of the mature hepatocyte markers (Supporting Information Figure S4).
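A per-gene comparison across three groups, as described above, can be sketched as follows. The statistical test behind the differential expression analysis is not named in this section; the Kruskal-Wallis test is used here only as an assumption, and the expression values are made up. With exactly three groups the chi-square approximation has 2 degrees of freedom, for which the upper-tail probability is exp(-H/2).

```python
from math import exp


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(g1, g2, g3):
    """Return (H, p) for three groups, with tie correction.
    p uses the chi-square approximation with 2 degrees of freedom."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0  # every value identical: no evidence of a difference
    h /= correction
    return h, exp(-h / 2)


# Hypothetical expression of one gene in three sets; NOT study data.
set1, set2, set3 = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]
h, p = kruskal_wallis_three_groups(set1, set2, set3)
print(round(h, 3), round(p, 4))
```

In a panel of 24 genes the test would be repeated once per gene, and a multiple-testing correction would normally be applied to the resulting p-values.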
The thirteen genes were submitted to Reactome pathway analysis, searching for (1) pathways related to biological mechanisms and (2) pathways that are specific to cancer. Among the 158 pathways (p<0.05) (Supporting Information Table S1), the top five were mRNA splicing - major pathway (R-HSA-72163), mRNA splicing (R-HSA-72172), processing of capped intron-containing pre-mRNA (R-HSA-72203), apoptosis (R-HSA-109581), and protein localization (R-HSA-9609507). Considering cancer pathways, among 33 pathways (p<0.05) (Supporting Information Table S2), the top five were endosomal/vacuolar pathway (R-HSA-1236977), TP53 regulates transcription of DNA repair genes (R-HSA-6796648), signaling by BRAF and RAF fusions (R-HSA-6802952), constitutive signaling by aberrant PI3K in cancer (R-HSA-2219530), and oncogenic MAPK signaling (R-HSA-6802957).
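Pathway over-representation of a gene list is commonly scored with a one-sided hypergeometric test, which asks how likely it is to see at least the observed overlap between the submitted genes and a pathway by chance. The sketch below shows that calculation in Python; whether the p-values reported here were obtained in exactly this way is not stated in this section, and the universe size, pathway size, and overlap in the example are hypothetical numbers.

```python
from math import comb


def overrepresentation_p(universe, pathway_size, submitted, overlap):
    """One-sided hypergeometric p-value: probability of drawing at least
    `overlap` pathway genes when `submitted` genes are drawn without
    replacement from `universe` genes, of which `pathway_size` belong
    to the pathway."""
    upper = min(pathway_size, submitted)
    total = comb(universe, submitted)
    tail = sum(
        comb(pathway_size, i) * comb(universe - pathway_size, submitted - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers for illustration only: a 13-gene list, a pathway of
# 100 genes, 3 genes in common, and an annotated universe of 20000 genes.
p_value = overrepresentation_p(universe=20000, pathway_size=100, submitted=13, overlap=3)
print(p_value)
```

Because many pathways are tested at once, such p-values are usually reported alongside a multiple-testing-adjusted value.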
Comparison of global DNA methylation among the three groups (Figure 4) showed that Set-3 had an overall methylation level similar to that of non-tumoral liver samples, in agreement with the expression data. The Set-1 samples, which had an expression profile similar to embryonic stages of hepatocyte differentiation, had a lower level of global DNA methylation than the other two groups. The Set-2 samples, which comprise HBs with a transitional gene expression profile relative to the others, had an intermediate mean level of DNA methylation.
Table 1 presents the clinical characteristics of the HB cases, grouped according to the Set to which they were assigned based on their gene expression profile. The somatic driver mutations previously detected in some HBs and the results of the beta-catenin immunohistochemistry analysis are also presented (data derived from Aguiar et al. 2020 26). Set-1 was composed of intermediate-risk HBs, according to the CHIC criteria27,28. Notably, three of the nine patients in Set-3 died, three developed pulmonary metastases, and one case was characterized as having HB/HCC features (carrying a TERT promoter mutation; Supporting Information Figure S5 – Kaplan-Meier survival curve). Set-2 contains clinically heterogeneous tumors, with two cases of late diagnosis, one of them carrying a somatic mutation in the TERT promoter.
Despite the small number of samples in each set, a trend in relation to histology can be noticed: epithelial embryonal histology predominates in Set-1, mixed epithelial and mesenchymal histology in Set-2, and epithelial fetal histology in Set-3. Figure 5 summarizes our findings and proposes a stratification of this cohort of HBs.