2. Assessment of the genomic context of tri5 genes reveals their potential involvement in unknown biosynthetic pathways in Trichoderma
The presence of tri5 orthologs in Trichoderma spp. that have not been described as trichothecene producers, such as T. gamsii, T. asperellum and T. guizhouense, suggests that this gene could be involved in the biosynthesis of different trichodiene derivatives. Multiple sequence alignment of the TRI5 proteins showed that the active centre is highly conserved, sharing the DDSRE/DDSIE aspartate-rich motif and the NDLFSFYKE triad (Fig. 2a). Pairwise alignments of each TRI5 protein with those of T. arundinaceum and T. brevicompactum showed 77% amino-acid identity for T. guizhouense, 80-82% for T. asperellum, and 87% for T. gamsii.
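The two checks described above (motif conservation and pairwise identity) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, the input strings are short toy fragments rather than real TRI5 sequences, and identity is computed here over ungapped aligned columns only, which is one of several conventions in use and is an assumption.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns with a gap ('-') in either sequence are ignored,
    so the denominator is the number of ungapped aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def find_motifs(sequence: str, motifs: list[str]) -> dict[str, list[int]]:
    """Return 0-based start positions of each exact motif found in sequence."""
    hits: dict[str, list[int]] = {}
    for motif in motifs:
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start)
            start = sequence.find(motif, start + 1)
        if positions:
            hits[motif] = positions
    return hits


if __name__ == "__main__":
    # Toy fragments only: the two variants of the aspartate-rich motif.
    print(percent_identity("DDSRE", "DDSIE"))
    print(find_motifs("AADDSREAA", ["DDSRE", "DDSIE"]))
```

Identity values depend on the alignment method and on how gaps are counted, so figures obtained with a sketch like this are not expected to match published percentages exactly.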
Assessment of the genomic context of tri5 genes with antiSMASH 5.0 (Blin et al., 2019) revealed that, in T. gamsii, this gene lies within a 21.2 kb cluster together with six other genes, here named A, B, C, D, E and F (Fig. 2b). Manual characterization based on conserved domains and similarity to characterized proteins in other systems enabled the identification of four tailoring enzymes, one efflux transporter and one regulatory protein. The three genes located upstream of tri5 encode a Zn2-C6 transcription factor (TF) (A), an oxygenase (B) and an alpha-beta hydrolase (C), while the three located downstream were identified as an oxygenase (D), a Major Facilitator Superfamily (MFS) transporter (E) and a carbonic anhydrase (F).
Alignment of these proteins with the TRI (trichothecene) proteins functionally associated with tri5 in the trichothecene-producing species of the Brevicompactum clade showed no sequence similarity. Furthermore, the genome of T. gamsii lacks the entire set of genes encoding the TRI proteins, with the exception of a distantly related homolog of the gene tri101, which has already been reported in other Trichoderma species (Proctor et al., 2018). We used the protein sequences encoded in the T. gamsii cluster as queries in BLASTp analyses to search for homologous proteins in the Trichoderma genomes used here. Genes A, B and C were also found, with conserved synteny, in all the other Trichoderma spp. belonging to the Viride clade (Fig. 2c), and preliminary BLAST analyses suggest that these genes may have originated by horizontal gene transfer (HGT) from a donor belonging to the Eurotiomycetes. In any case, further analyses are needed to better understand the evolutionary origins of these genes. By contrast, genes D and F are present in some of the analysed genomes of closely related species, whereas gene E seems to be specific to T. gamsii.
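As an illustration of how a presence/absence survey like the one above might be tabulated, the following Python sketch parses BLAST tabular output (the standard 12-column `-outfmt 6` layout) and records, for each query, the genomes with at least one passing hit. It is a sketch under stated assumptions, not the procedure used here: the e-value and identity cut-offs are arbitrary, the `genome|protein` subject naming is a hypothetical convention, and the example rows are invented.

```python
import csv
import io
from collections import defaultdict


def presence_table(blast_tsv: str,
                   max_evalue: float = 1e-5,
                   min_identity: float = 30.0) -> dict[str, set[str]]:
    """Map each query ID to the set of genomes with a passing BLASTp hit.

    Expects BLAST -outfmt 6 columns: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    Assumption: subject IDs look like 'genome|protein'.
    Both thresholds are illustrative, not values from the study.
    """
    present: dict[str, set[str]] = defaultdict(set)
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        query, subject = row[0], row[1]
        pident, evalue = float(row[2]), float(row[10])
        if evalue <= max_evalue and pident >= min_identity:
            genome = subject.split("|")[0]
            present[query].add(genome)
    return dict(present)


if __name__ == "__main__":
    # Invented rows for illustration only.
    rows = [
        "geneA\tgenome1|p1\t85.0\t300\t45\t0\t1\t300\t1\t300\t1e-150\t500",
        "geneA\tgenome2|p7\t25.0\t100\t75\t0\t1\t100\t1\t100\t2.0\t20",
        "geneE\tgenome1|p9\t90.0\t400\t40\t0\t1\t400\t1\t400\t0.0\t700",
    ]
    print(presence_table("\n".join(rows)))
```

A table of this kind only reports sequence similarity. Conserved synteny and any HGT hypothesis require separate evidence, such as gene order comparisons and phylogenetic analysis.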
These findings suggest the origin of a novel tri5-associated cluster in T. gamsii, which is likely involved in the biosynthesis of trichodiene derivatives with unknown functions. Accordingly, tri5 could participate in two different sesquiterpene biosynthetic pathways in Trichoderma.