2.6 Gene prediction and annotation
De novo prediction and homology-based gene searches were used to annotate protein-coding genes in the C. japonica genome. The repeat-masked genome was used for all subsequent analyses, which followed the EVidenceModeler (EVM) v1.1.1 genome annotation pipeline (Haas et al., 2008). First, we used BRAKER v2 (https://github.com/Gaius-Augustus/BRAKER) to perform de novo gene prediction. Second, protein sequences of lepidopteran insects were downloaded from NCBI RefSeq and used as templates for homology-based prediction with GenomeThreader v1.7.3 (Gremme et al., 2005). Finally, EVidenceModeler was used to integrate these two lines of evidence, each assigned a different weight, into a consensus gene set in GFF3 format. The number of protein-coding genes was then counted from this final annotation.
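The integration and counting steps can be illustrated with a minimal Python sketch. It assumes the standard three-column, tab-delimited EVM weights file (evidence class, source label, weight) and counts `gene` features in a GFF3 file. The weights (1 and 5) and the source labels (`AUGUSTUS`, `genomeThreader`) are hypothetical placeholders, since the values actually used are not reported here; the source label must match column 2 of the corresponding evidence GFF3 file. The function names are likewise illustrative.

```python
from typing import Iterable, List, Tuple

# Evidence classes accepted in an EVM weights file.
EVM_CLASSES = {"ABINITIO_PREDICTION", "PROTEIN", "TRANSCRIPT", "OTHER_PREDICTION"}

# Hypothetical weights and source labels (not taken from the paper).
EXAMPLE_WEIGHTS: List[Tuple[str, str, int]] = [
    ("ABINITIO_PREDICTION", "AUGUSTUS", 1),   # de novo models, e.g. from BRAKER
    ("PROTEIN", "genomeThreader", 5),         # protein alignments, e.g. from GenomeThreader
]


def format_weights(entries: Iterable[Tuple[str, str, int]]) -> str:
    """Return the text of an EVM weights file: class<TAB>source<TAB>weight."""
    lines = []
    for evidence_class, source, weight in entries:
        if evidence_class not in EVM_CLASSES:
            raise ValueError(f"unknown evidence class: {evidence_class}")
        if weight < 0:
            raise ValueError(f"weight must be non-negative: {weight}")
        lines.append(f"{evidence_class}\t{source}\t{weight}")
    return "\n".join(lines) + "\n"


def count_genes(gff3_lines: Iterable[str]) -> int:
    """Count features whose type (column 3) is 'gene' in GFF3 lines."""
    count = 0
    for line in gff3_lines:
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9 and fields[2] == "gene":
            count += 1
    return count
```

In practice, the string returned by `format_weights` would be written to a file and passed to EVM as its weights file, and `count_genes` would be given the open handle of the consensus GFF3 output.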