Gene content and arrangement of the maternally inherited markers
Mitochondrial minichromosomes. Neither the SPAdes nor the aTRAM
method was successful in reconstructing complete sequences of the eleven
mitochondrial minichromosomes. However, the aTRAM assemblies contained
whole coding regions and were used for both the phylogenetic
reconstruction and the comparison of gene content between the SE and SW
lineages. Despite considerable genetic differences (Table S2),
the mitochondrial minichromosomes show an identical arrangement of the
genes (shared synteny) in both the SW and SE lineages (Table S3). This
arrangement is also very similar to that in the related louse species,
P. spinulosa. Concatenation of the minichromosomes produced a
15,693 bp long matrix. When phylogenetically analyzed, it yielded a tree
with two very distant clusters corresponding to the SE and SW lineages
(Figures 4b, S2). Within both clusters, the distances were significantly
lower than the distances between the clusters. However, the range of
distances was wider within the SW than within the SE lineage (Table S2,
Figure 4), likely reflecting the broader geographic sampling range of
the SW lineage.
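
As an illustration of the concatenation and distance comparison described above, the following minimal Python sketch joins per-minichromosome alignments into a single matrix and computes uncorrected pairwise distances (p-distances). It is not the pipeline used in this study: the sample names, toy sequences, and function names are hypothetical, and the choice to skip columns containing a gap or N is an assumption made for the example.

```python
from itertools import combinations


def concatenate(alignments):
    """Join per-locus alignments (dicts of sample -> aligned sequence) into one matrix."""
    samples = sorted(alignments[0])
    for aln in alignments:
        if sorted(aln) != samples:
            raise ValueError("every alignment must contain the same samples")
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError("sequences within an alignment must have equal length")
    return {s: "".join(aln[s] for aln in alignments) for s in samples}


def p_distance(a, b):
    """Proportion of differing sites; columns with a gap or N are skipped (assumption)."""
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        diffs += x != y
    return diffs / compared if compared else float("nan")


def pairwise_distances(matrix):
    """Return p-distances for every pair of samples, keyed by the sorted sample pair."""
    return {
        (s, t): p_distance(matrix[s], matrix[t])
        for s, t in combinations(sorted(matrix), 2)
    }


# Hypothetical toy alignments standing in for two minichromosomes.
loci = [
    {"SW_1": "ATGCCA", "SW_2": "ATGCCA", "SE_1": "ATGTTA"},
    {"SW_1": "GGA-TC", "SW_2": "GGAATC", "SE_1": "CGAATC"},
]
matrix = concatenate(loci)
dist = pairwise_distances(matrix)
```

In this toy example the two SW samples are identical at all compared sites, whereas each differs from the SE sample at three sites, mirroring the pattern of small within-cluster and large between-cluster distances.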
Legionella. Genomes of the symbiont L. polyplacis revealed a
phylogenomic structure parallel to that of the mtDNA (Figures 4a, S3), with a
deep genetic split between the SW and SE lineages. The complete genomes
displayed a high degree of similarity, with all pairwise comparisons
exceeding 99% identity across the 530,063 bp matrix. The contrast
between the intra- and inter-cluster comparisons is better illustrated
by the counts of the observed differences, which were 213-215 within the
SW cluster and 0-113 within the SE cluster, compared to 3,702-3,727
between the clusters (Table S2). When comparing the genome sequences, we
did not find any clear instance of missing genes. The majority of the
gaps introduced by the genome alignment span just one or two nucleotides
and are located in intergenic regions (only one deletion spans 26
nucleotides, and it too lies between gene coding sequences). The
annotations provided by RAST contained several differences between the
two clusters, suggesting that a gene present in one lineage is shortened
or missing in the other. In all of these cases, however, the
differences were caused not by a genuine absence of the gene sequence
but rather by the failure of the annotation algorithm to recognize the
sequence as coding for a gene, most likely due to the aberrant nature of
highly derived symbiont genomes.
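
To relate the difference counts reported above to the percent identity, the counts can be divided by the length of the matrix. The short Python sketch below performs this arithmetic under the simplifying assumption that every observed difference occupies one column of the 530,063 bp matrix; the function and variable names are illustrative only.

```python
MATRIX_LENGTH = 530_063  # length of the genome matrix reported in the text (bp)


def percent_identity(differences, length=MATRIX_LENGTH):
    """Percent identity, assuming each difference occupies one alignment column."""
    return 100.0 * (1.0 - differences / length)


# Largest within-SE count and the two between-cluster extremes from the text.
within_se_max = percent_identity(113)
between_low = percent_identity(3_702)
between_high = percent_identity(3_727)
```

Under this assumption, even the largest between-cluster count (3,727 differences) corresponds to roughly 99.3% identity, whereas the largest within-SE count (113) corresponds to about 99.98%. This is why the raw counts separate the two levels of comparison more clearly than the percentages do.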