2.4 | Taxonomic and functional annotation
The genes of the RGC were translated into amino acid sequences using NCBI genetic code 11 (Kozak, 1983). Taxonomic annotation of the amino acid sequences was performed with Kaiju v1.8.0 (Menzel, Ng, & Krogh, 2016) against the NCBI NR database (released on 24 February 2021) with the parameters ‘-a greedy -e 5 -E 0.01 -v -z 5’, providing a detailed overview of the taxonomic composition of the SNM gut microbiome. Functional annotation was performed against the Kyoto Encyclopedia of Genes and Genomes database (KEGG, Release 100.0; genes from animals and plants were excluded) and the evolutionary genealogy of genes: Non-supervised Orthologous Groups database (eggNOG, v5.0) using Diamond (v0.9.24) with the parameters ‘--evalue 1e-5 -k 1’ (Buchfink et al., 2015). KEGG and eggNOG annotations were assigned by an in-house pipeline, in which each protein was assigned to a KEGG orthologous group (KO) or eggNOG orthologous group (OG) when its highest-scoring annotated hit contained at least one alignment scoring over 60 bits (Qin et al., 2012). Carbohydrate-active enzymes were annotated with the Carbohydrate-Active enZYmes database (CAZy): protein sequences were mapped to the hidden Markov model (HMM) libraries of CAZyme families downloaded from the CAZy database (CAZyDB.07312019) with the hmmscan program in HMMER (3.1b2) (Pattabiraman & Warnow, 2021).
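The ortholog assignment rule can be illustrated with a minimal Python sketch. It is not the in-house pipeline, which is not described in further detail here. It assumes that the Diamond output is in the default 12-column BLAST tabular format (bit score in the last column), that the threshold of 60 is read as a bit-score cutoff following Qin et al. (2012), and that a lookup table from reference gene identifiers to KO or OG identifiers is available. The function name `assign_orthologs`, the mapping `gene_to_group`, and the example identifiers are hypothetical.

```python
import csv
import io

# Assumption: the cutoff of 60 is interpreted as an alignment bit score.
BITSCORE_CUTOFF = 60.0


def assign_orthologs(diamond_tsv, gene_to_group, cutoff=BITSCORE_CUTOFF):
    """Assign each query protein to the ortholog group of its best annotated hit.

    diamond_tsv   -- file-like object with 12-column BLAST tabular lines
    gene_to_group -- hypothetical dict mapping reference gene IDs to KO/OG IDs
    cutoff        -- minimum bit score (exclusive) required for assignment
    """
    best = {}  # query ID -> (bit score, subject ID) of the best annotated hit
    for row in csv.reader(diamond_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        query, subject, bitscore = row[0], row[1], float(row[11])
        if subject not in gene_to_group:
            continue  # hit has no KO/OG annotation
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, subject)
    # Keep only queries whose best annotated hit scores above the cutoff.
    return {q: gene_to_group[s] for q, (b, s) in best.items() if b > cutoff}


# Illustrative input: two made-up alignments, one above and one below the cutoff.
example = io.StringIO(
    "gene1\teco:b0001\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t180.0\n"
    "gene2\teco:b0002\t40.0\t50\t30\t0\t1\t50\t1\t50\t1e-6\t45.5\n"
)
groups = {"eco:b0001": "K00001", "eco:b0002": "K00002"}
result = assign_orthologs(example, groups)
print(result)
```

In this example only `gene1` receives a KO, because the alignment for `gene2` scores below the assumed cutoff. With ‘-k 1’ Diamond reports at most one target per query, so the best-hit selection in the loop matters only if the same function is reused on output with more targets per query.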