The recently released genome of Brassica napus was downloaded from the Cotton Research Institute (CRI) of Nanjing Agricultural University, China (http://mascotton.njau.edu.cn/Data.htm, v1.1), and used as the reference genome [19].
FastQC-toolkit (v0.11.2) was used to filter out low-quality reads that met any of the following criteria: (i) ≥10% unidentified nucleotides (N); (ii) >50% of the read length with a Phred quality value ≤10; (iii) presence of adapter sequence.
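The three read-filtering criteria above can be illustrated with a minimal Python sketch. This is not the filtering tool used in the study; it only restates the criteria as code. The function names, the Phred+33 quality encoding, and the example adapter sequence are assumptions made for illustration.

```python
from typing import List


def phred_scores(quality_string: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string (Phred+33 encoding is assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def keep_read(
    seq: str,
    quality_string: str,
    adapter: str,
    max_n_frac: float = 0.10,      # criterion (i): discard if N fraction >= 10%
    low_q: int = 10,               # criterion (ii): a base is "low quality" if Phred <= 10
    max_low_q_frac: float = 0.50,  # criterion (ii): discard if > 50% of bases are low quality
) -> bool:
    """Return True if the read survives all three filters, False otherwise."""
    if not seq:
        return False
    length = len(seq)
    upper = seq.upper()

    # (i) too many unidentified nucleotides
    if upper.count("N") / length >= max_n_frac:
        return False

    # (ii) more than half of the read has low base quality
    low_quality_bases = sum(1 for q in phred_scores(quality_string) if q <= low_q)
    if low_quality_bases / length > max_low_q_frac:
        return False

    # (iii) read contains the adapter (hypothetical sequence supplied by the caller)
    if adapter and adapter.upper() in upper:
        return False

    return True
```

For example, `keep_read("ACGTACGTAC", "IIIIIIIIII", "AGATCGGAAGAGC")` keeps the read, whereas replacing one base with `N` in this 10-base read reaches the 10% threshold and discards it.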
The remaining clean reads were aligned to the Brassica napus reference genome using BWA-MEM (v0.7.16a) [20] with default parameters.
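As a rough sketch of the alignment step, the snippet below builds and runs a `bwa mem` command with default parameters from Python. The file names are hypothetical, paired-end input is assumed, and the wrapper functions are illustrative rather than part of the original pipeline.

```python
import subprocess
from typing import List


def bwa_mem_command(reference: str, reads_1: str, reads_2: str) -> List[str]:
    """Build a default-parameter `bwa mem` command for one paired-end sample."""
    return ["bwa", "mem", reference, reads_1, reads_2]


def align(reference: str, reads_1: str, reads_2: str, out_sam: str) -> None:
    """Run bwa mem and write the SAM output to `out_sam`.

    Assumes `bwa` is on PATH and the reference has already been indexed
    with `bwa index`.
    """
    with open(out_sam, "w") as handle:
        subprocess.run(
            bwa_mem_command(reference, reads_1, reads_2),
            stdout=handle,
            check=True,  # raise CalledProcessError if bwa exits with an error
        )
```

A call such as `align("reference.fa", "sample_R1.fq.gz", "sample_R2.fq.gz", "sample.sam")` would then produce the alignment file; all four paths here are placeholders.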
Sequence Alignment/Map tools (SAMtools, v1.4.1) [21] was used to sort and index the resulting binary alignment map (BAM) files. Duplicates were excluded using SAMtools (v1.102), and the final sorted BAM files were used in the downstream analysis.

Variant calling and filtering were performed to reduce errors arising from inaccurate alignment. Local realignment around insertions and deletions, base quality recalibration of the reads, and variant calling were conducted using freebayes (v1.0.2) [22, 23]. Variants that fulfilled the following criteria were retained: (1) mapping quality filter equivalent to PASS; (2) quality depth (QD) >2; (3) mapping quality (MQ) >70; (4) QUAL >30. Variants were further removed if the coverage was <10, if more than two SNPs clustered within a 5 bp window, or if a SNP lay within 5 bp of an indel.

SV detection and annotation

BreakDancer was used to predict five types of structural variants (SVs), namely insertions (INSs), deletions (DELs), inversions (INVs), intra-chromosomal translocations (ITXs), and inter-chromosomal translocations (CTXs), from next-generation paired-end sequencing reads, using read pairs mapped with unexpected separation distances or orientations. SVs with read depth <2 were filtered out. Bedtools was employed to annotate the detected DELs, INSs, and INVs.

The detection and annotation of CNVs

Copy number variations (CNVs) refer to normal variation in the number of copies of one or more sections of a genomic fragment. We used CNVnator (parameters: -call 100) to identify CNVs and bedtools to annotate them.
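The SNP/indel retention criteria described above (PASS, QD, MQ, QUAL, coverage, SNP clustering, and proximity to indels) can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the filtering code used in the study: the `Variant` record and function names are hypothetical, a "5 bp window" is read as positions p to p+4, distance to an indel is measured from the indel start position, and only indels that pass the threshold filters are considered when removing nearby SNPs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    chrom: str
    pos: int        # 1-based position (indel: start position)
    is_indel: bool
    filter: str     # FILTER field, e.g. "PASS"
    qd: float       # quality by depth
    mq: float       # mapping quality
    qual: float     # QUAL field
    depth: int      # read coverage at the site


def passes_thresholds(v: Variant, min_qd: float = 2.0, min_mq: float = 70.0,
                      min_qual: float = 30.0, min_depth: int = 10) -> bool:
    """Per-variant thresholds as stated in the text (coverage < 10 is removed)."""
    return (v.filter == "PASS" and v.qd > min_qd and v.mq > min_mq
            and v.qual > min_qual and v.depth >= min_depth)


def filter_variants(variants: List[Variant], window: int = 5,
                    max_cluster: int = 2, indel_gap: int = 5) -> List[Variant]:
    """Apply thresholds, then drop clustered SNPs and SNPs close to indels."""
    kept = sorted((v for v in variants if passes_thresholds(v)),
                  key=lambda v: (v.chrom, v.pos))
    snps = [v for v in kept if not v.is_indel]
    indels = [v for v in kept if v.is_indel]
    dropped = set()  # ids of SNP records to remove

    # More than `max_cluster` SNPs inside one window: drop every SNP in the cluster.
    for i in range(len(snps) - max_cluster):
        first, last = snps[i], snps[i + max_cluster]
        if first.chrom == last.chrom and last.pos - first.pos < window:
            dropped.update(id(s) for s in snps[i:i + max_cluster + 1])

    # SNPs within `indel_gap` bp of an indel on the same chromosome.
    for s in snps:
        if any(s.chrom == d.chrom and abs(s.pos - d.pos) <= indel_gap for d in indels):
            dropped.add(id(s))

    return [v for v in kept if id(v) not in dropped]
```

The thresholds are exposed as keyword arguments so that the values quoted in the text can be adjusted without changing the logic.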