3.2 Genome annotation
In total, 342 Mb of repeat sequences were identified, accounting for 58.53% of the C. japonica genome. This proportion is similar to that of the close relative A. pernyi (60.74%) but higher than those of A. yamamai (37.33%) and S. ricini (34.3%) (Table S1). Unclassified elements, long interspersed nuclear elements (LINEs), and DNA transposons were the most abundant transposable elements (TEs). Genome annotation combining de novo gene prediction and homology-based gene search yielded 24791 protein-coding genes, considerably more than in A. pernyi (20814), A. yamamai (14638), and S. ricini (20366) (Table 2).
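
The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis. It assumes that the 58.53% is expressed relative to the total assembly length and that 342 Mb is a rounded value, so the implied assembly size is only approximate. The function names `implied_assembly_mb` and `percent_excess` are hypothetical helpers introduced here for illustration.

```python
# Values copied from the paragraph above (reported in Table S1 and Table 2).
REPEAT_MB = 342.0         # total repeat sequence in C. japonica, in Mb (rounded in the text)
REPEAT_FRACTION = 0.5853  # repeat share of the C. japonica genome (58.53%)

GENE_COUNTS = {
    "C. japonica": 24791,
    "A. pernyi": 20814,
    "A. yamamai": 14638,
    "S. ricini": 20366,
}


def implied_assembly_mb(repeat_mb: float, repeat_fraction: float) -> float:
    """Assembly size implied by a repeat length and its genome share.

    Assumption: the share is measured against the whole assembly.
    """
    return repeat_mb / repeat_fraction


def percent_excess(focal: int, other: int) -> float:
    """Percentage by which `focal` exceeds `other`."""
    return 100.0 * (focal - other) / other


implied_mb = implied_assembly_mb(REPEAT_MB, REPEAT_FRACTION)
print(f"Implied C. japonica assembly size: {implied_mb:.1f} Mb")

# Compare the C. japonica gene count with each relative.
focal_count = GENE_COUNTS["C. japonica"]
excess = {
    species: percent_excess(focal_count, count)
    for species, count in GENE_COUNTS.items()
    if species != "C. japonica"
}
for species, pct in excess.items():
    print(f"C. japonica has {pct:.1f}% more predicted genes than {species}")
```

Under these assumptions, the relative differences in gene count give a more concrete reading of the comparison than the raw counts alone.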