Protein-coding gene prediction
The Isoseq3 pipeline (https://github.com/pacificbiosciences/isoseq) was used to process the full-length transcriptome data of Chinese flowering cabbage and obtain the transcript sequences. To obtain a more complete gene annotation, we pooled the annotated gene sequences of B. juncea (J. Yang et al., 2016), B. napus (Chalhoub et al., 2014), B. oleracea (Liu et al., 2014), B. rapa (Zhang et al., 2018) and B. nigra (W. Wang et al., 2019) as reference gene sequences and removed redundant sequences with CD-HIT-EST (https://github.com/weizhongli/cdhit). The repeat sequences identified by EDTA (Ou et al., 2019) and TRF (Benson, 1999) were supplied to MAKER (Cantarel et al., 2008) as the reference repeat library, and MAKER was run for 5 rounds of gene and repeat sequence annotation.
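The redundancy-removal step can be illustrated with a minimal sketch that assembles a CD-HIT-EST command line for the pooled reference gene sequences. The file names, the 0.95 identity threshold, the word size and the thread count are hypothetical values chosen for illustration; the parameters actually used in this study are not stated above. Only the standard CD-HIT-EST options -i, -o, -c, -n and -T are used, and the command is built but not executed.

```python
import shlex
from pathlib import Path


def build_cdhit_est_command(merged_fasta, output_fasta,
                            identity=0.95, word_size=10, threads=4):
    """Return the argument list for a cd-hit-est run (nothing is executed).

    All defaults are illustrative assumptions, not the settings of the study.
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in (0, 1]")
    return [
        "cd-hit-est",
        "-i", str(Path(merged_fasta)),   # pooled reference gene sequences (FASTA)
        "-o", str(Path(output_fasta)),   # non-redundant representative sequences
        "-c", str(identity),             # sequence identity threshold (assumed)
        "-n", str(word_size),            # word size; 10 is a usual choice near 0.95
        "-T", str(threads),              # number of threads (assumed)
    ]


# Hypothetical file names for the pooled Brassica gene set.
cmd = build_cdhit_est_command("brassica_reference_genes.fa",
                              "brassica_reference_genes.nr.fa")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a system where CD-HIT is installed; keeping the command construction separate from execution makes the chosen parameters easy to record alongside the annotation.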