Selection of ORF target regions
Mapping the clustered ORFs from our supertranscriptome to the BUSCO
Metazoa_odb9 database yielded 633 single-copy and 334 duplicated
BUSCO hits (Table 2, last column), of which 633
and 186, respectively, were retained as likely single-copy orthologous
targets across our ingroup taxa. Evaluation of orthology for ORFs that
mapped to the Unioverse probe set suggested that 186 of the 811
Unioverse loci (22.9%) are affected by homology issues for our
ingroup taxa (which belong to the taxa for which the Unioverse
probe set was designed). In most of these cases, several divergent ORFs
mapped to a single Unioverse locus, suggesting paralogy, but we
also observed instances where Unioverse loci were not
orthologous to their ‘associated’ Bathymodiolus target region, as
indicated by less sequence divergence between Bathymodiolus and
the matching fragment of our ingroup ORFs than between Bathymodiolus
and the associated Unioverse loci.
Nevertheless, our evaluation suggested that most Unioverse loci are
single-copy orthologs in Coelaturini, which resulted in the
addition of 297 ORF targets from the Unioverse probe set
(several Unioverse loci usually map to a single ORF). Mapping
ORFs and subregions against each other led to the removal of one ORF
from the duplicate BUSCO selection and another from the
Unioverse set, leaving a total of 1,114 retained ORFs
that cover 1,677,936 nucleotides (on average 1,506 nt/ORF).
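
The counts reported in this paragraph can be cross-checked with simple arithmetic. The following is a minimal sketch of that bookkeeping only, not part of the original analysis pipeline; all variable names are illustrative assumptions, and the numbers are taken directly from the text.

```python
# Bookkeeping check of the ORF target counts reported in the text.
# Variable names are hypothetical; values are copied from the paragraph.

busco_single_copy_kept = 633   # single-copy BUSCO hits, all retained
busco_duplicated_kept = 186    # retained out of 334 duplicated BUSCO hits
unioverse_orfs_added = 297     # ORF targets added from the Unioverse probe set
removed_by_cross_mapping = 2   # one duplicated-BUSCO ORF and one Unioverse ORF

total_orfs = (
    busco_single_copy_kept
    + busco_duplicated_kept
    + unioverse_orfs_added
    - removed_by_cross_mapping
)

total_nt = 1_677_936                      # nucleotides covered by retained ORFs
mean_nt_per_orf = total_nt / total_orfs   # average target length

unioverse_affected = 186   # Unioverse loci with suspected homology issues
unioverse_total = 811      # Unioverse loci evaluated
pct_affected = 100 * unioverse_affected / unioverse_total

print(f"retained ORFs: {total_orfs}")
print(f"mean length: {mean_nt_per_orf:.0f} nt/ORF")
print(f"Unioverse loci affected: {pct_affected:.1f}%")
```

Under these assumptions, the sums reproduce the 1,114 retained ORFs, the mean of roughly 1,506 nt per ORF, and the 22.9% of Unioverse loci flagged for homology issues.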