Correction and Improvement of HiRise assemblies
We compared the male and female HiRise assemblies to each other using
NUCmer (part of MUMmer v3.23_3, Kurtz et al., 2004), and to linkage
map information (described below) using ALLMAPS v0.8.12 (Tang et al.,
2015), to identify inconsistencies between the two assemblies, and
between the assemblies and the linkage map information. Aided by
NUCmer-based mummerplot comparisons of the relevant scaffolds, we
resolved inconsistencies by manually flipping and/or shuffling the
Chicago scaffolds that appeared to be mis-joined in the final HiRise
step, so that the two assemblies were consistent with each other and
with the linkage map information. When there were
inconsistencies between the male and female assemblies, we deemed the
assembly most consistent with the linkage map information to be correct.
At this stage, we also truncated to 999 bp any gaps longer than 1000 bp
that were present in the original draft assembly. We then scaffolded and
gap-filled each assembly using the corresponding assembly from the other
sex with LINKS v1.8.7 (Warren et al., 2015) and RAILS v1.4.1 (Warren,
2016). We filled the remaining gaps in the scaffolds with ABySS-Sealer
v2.1.5_1 (Paulino et al., 2015) using shotgun sequence data from the
original de novo assembly (NCBI SRA: SRR546193 for the female
assembly, and SRR546181 and SRR546185 for the male assembly). We removed
contaminant scaffolds identified by PhylOligo v1.0 (Mallet et al.,
2017). We ran BUSCO v4.1.4 (Simao et al., 2015) with the
insecta_odb10.2020-09-10 lineage dataset (Kriventseva et al., 2019) to
assess the improvement in gene content of the final genome assemblies
relative to the input de novo assemblies. Assembly consistency plots were
generated using JupiterPlot v1.0 (Chu, 2020) with Circos v0.69-3
(Krzywinski et al., 2009).
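
The gap-shortening step described above can be illustrated with a minimal Python sketch. It is not the script used in this study. It assumes that gaps are represented as uninterrupted runs of N (or n) characters in the scaffold sequence; the function name cap_gaps and its parameters are hypothetical and chosen only for illustration.

```python
import re

# Assumption: a gap is any uninterrupted run of N/n characters.
_GAP_RUN = re.compile(r"[Nn]+")


def cap_gaps(seq: str, threshold: int = 1000, new_len: int = 999) -> str:
    """Shorten every gap longer than `threshold` bp to `new_len` bp.

    Gaps of `threshold` bp or fewer are left unchanged.
    """
    def _shrink(match: re.Match) -> str:
        run = match.group(0)
        # Truncate only the runs that exceed the threshold.
        return run[:new_len] if len(run) > threshold else run

    return _GAP_RUN.sub(_shrink, seq)


if __name__ == "__main__":
    scaffold = "ACGT" + "N" * 1500 + "TTGA"
    capped = cap_gaps(scaffold)
    print(len(scaffold), len(capped))
```

Applied to each scaffold sequence in a FASTA file, a function of this kind would leave gaps of exactly 1000 bp or fewer untouched and reduce longer ones to 999 bp, matching the rule stated in the text.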