2.7 Gene family analyses
We identified homologous gene families involved in flowering time,
flower development, and flavonoid and carotenoid biosynthesis in
S. tetraptera. Known genes from each family were downloaded and used
as queries to search against the S. tetraptera genome using
BLASTP (Rédei, 2008). HMMER (Eddy, 2011) was then used to search for
previously known domains of the corresponding gene families in the
candidate sequences. Candidate genes lacking the expected domains
were removed. All query sequences and previously known domains are
summarized in Tables S23-24 and Tables S27-28. For each
gene family, MAFFT was used to align the protein sequences. Phylogenetic
trees were constructed with IQ-TREE using default parameters (Nguyen
et al., 2015) and visualized with EVOLVIEW (Z. He et al., 2016).
We also predicted the transcription factors in the S. tetraptera genome
using PlantRegMap (Tian, Yang, Meng, Jin, & Gao, 2020) and the
PlantTFDB database (Jin et al., 2017). In addition, the R package
clusterProfiler v3.6.0 (G. Yu, Wang, Han, & He, 2012) was used to
analyze the enrichment of gene families in this study.
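
The candidate screening described above (BLASTP hits retained only when HMMER confirms the expected domains) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: it assumes BLASTP was run with tabular output (-outfmt 6) and HMMER with hmmsearch --tblout, and the function names, the E-value cutoffs, and the required-domain set are hypothetical placeholders, since the thresholds are not stated in this section.

```python
from typing import Dict, Iterable, List, Set


def blastp_candidates(blast_lines: Iterable[str], max_evalue: float = 1e-5) -> Set[str]:
    """Collect subject IDs from BLASTP tabular output (-outfmt 6).

    The 1e-5 cutoff is an assumed placeholder, not a value from the text.
    """
    candidates: Set[str] = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Default outfmt 6 columns: qseqid sseqid pident length mismatch
        # gapopen qstart qend sstart send evalue bitscore
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue:
            candidates.add(subject_id)
    return candidates


def domain_hits(tblout_lines: Iterable[str], max_evalue: float = 1e-3) -> Dict[str, Set[str]]:
    """Map each sequence ID to the domains reported by hmmsearch --tblout.

    The 1e-3 cutoff is likewise an assumed placeholder.
    """
    hits: Dict[str, Set[str]] = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # tblout columns: target name, target accession, query name,
        # query accession, full-sequence E-value, ...
        seq_id, domain, evalue = fields[0], fields[2], float(fields[4])
        if evalue <= max_evalue:
            hits.setdefault(seq_id, set()).add(domain)
    return hits


def filter_candidates(
    candidates: Set[str],
    hits: Dict[str, Set[str]],
    required_domains: Set[str],
) -> List[str]:
    """Keep only candidates that carry every required domain."""
    return sorted(c for c in candidates if required_domains <= hits.get(c, set()))
```

Here `required_domains` would hold the domain names listed for a given gene family (for example, those in the supplementary tables); a candidate with a BLASTP hit but no matching domain is dropped, mirroring the removal step in the text.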