2.7 Phylogeny and divergence time estimation
Marsupials (metatherians) and eutherians diverged ~160
million years ago, long before the radiation of extant eutherian clades
(~100 million years ago) (Luo, Yuan, Meng, & Ji, 2011;
Phillips, Bennett, & Lee, 2009). We used the platypus
(Ornithorhynchus anatinus) as the outgroup because this species is
frequently used as such for both eutherian and marsupial mammals
and has a high-continuity, well-annotated genome (contig N50 15.1 Mb;
scaffold N50 83.3 Mb). In contrast, the most closely
related eutherian mammals, the edentates (sloths, armadillos, and
anteaters), currently have relatively poor-quality genome assemblies.
These include, for example, the armadillo Dasypus novemcinctus
(contig N50 0.03 Mb; scaffold N50 1.7 Mb) and the sloth Choloepus
hoffmanni (contig N50 0.06 Mb; scaffold N50 0.4 Mb).
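The N50 values quoted above summarise assembly contiguity: N50 is the length of the shortest contig (or scaffold) in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal illustrative sketch of that definition, not the tool used to produce the reported statistics; the function name and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    at least half of the summed length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one non-zero length")


# Toy example (hypothetical contig lengths, not data from this study).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

A higher N50 at a similar assembly size means that fewer, longer sequences make up half of the genome, which is why the platypus assembly is considered more contiguous than the armadillo and sloth assemblies.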
We identified 7,116 high-confidence 1:1 orthologs by interrogating the
predicted proteins from the gene models of 11 species (nine marsupials
and the platypus) using SonicParanoid v1.3.0 (Cosentino & Iwasaki,
2019). Because the A. arktos, A. argentus, and M.
murexia pseudo-genomes were derived from A. flavipes, their
genes were extracted using the A. flavipes gene annotation file.
The corresponding coding sequences (CDS) for each species were aligned
using PRANK v100802 (Loytynoja & Goldman, 2005) and filtered by Gblocks
v0.91b (Talavera & Castresana, 2007) to identify conserved blocks
(removing gaps and ambiguous sites, and excluding alignments shorter
than 300 bp), leaving 7,116 genes. Maximum-likelihood (ML) phylogenetic
trees were generated using RAxML v7.2.8 (Stamatakis, 2006) and FastTree
v2.1.10 (Price, Dehal, & Arkin, 2010) with three CDS data sets: the
whole sequence, first codon positions, and fourfold degenerate (4d)
sites. Identical topologies and similar support values were obtained
(1,000 bootstrap iterations were performed). Divergence times between
species were estimated using MCMCTree [Bayesian molecular clock model
implemented in PAML v4.7 (Yang, 2007)] with the JC69 nucleotide
substitution model (Jukes & Cantor, 1969), using the 4d ML tree and
concatenated supergenes of first codon positions and fourfold degenerate
(4d) sites as inputs. We ran 100,000 iterations after a burn-in of
10,000 iterations. MCMCTree calibration points (million years ago; Mya)
were obtained from TimeTree (Kumar, Stecher, Suleski, & Hedges, 2017):
O. anatinus-P. cinereus (~167-192 Mya), V. ursinus-D. virginiana
(~72-86 Mya), V. ursinus-M. melanurus (~56-64 Mya), V. ursinus-P.
cinereus (~31-39 Mya), M. domestica-D. virginiana (~24-39 Mya),
and S. harrisii-M. melanurus (~4-22 Mya).
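The codon-partitioned data sets described above (first codon positions and 4d sites, concatenated into supergenes) can be illustrated with a short sketch. It assumes each gene is an in-frame, gap-free CDS alignment held as a dictionary of species name to sequence, and it treats a column as 4d only when every species carries a codon from a fourfold degenerate family under the standard genetic code. This criterion, the function names, and the toy sequences are assumptions for illustration; the source does not state how these sites were extracted.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_TCAG = "TCAG"
_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_TCAG, repeat=3), _AA)}

# Two-base prefixes whose third position never changes the amino acid.
FOURFOLD_PREFIXES = {
    b1 + b2
    for b1, b2 in product(_TCAG, repeat=2)
    if len({CODON_TABLE[b1 + b2 + b3] for b3 in _TCAG}) == 1
}


def passes_length_filter(aln, min_len=300):
    """True if the alignment is at least min_len columns long."""
    return min(len(seq) for seq in aln.values()) >= min_len


def first_codon_positions(aln):
    """Keep only the first base of every codon in each sequence."""
    return {name: seq[0::3] for name, seq in aln.items()}


def fourfold_sites(aln):
    """Keep third codon positions where all species have a 4d codon.

    Assumption: a column qualifies only if every sequence's codon
    starts with a fourfold degenerate prefix and ends in A, C, G, or T.
    """
    names = list(aln)
    length = min(len(seq) for seq in aln.values())
    kept = {name: [] for name in names}
    for i in range(0, length - length % 3, 3):
        codons = [aln[name][i:i + 3].upper() for name in names]
        if all(c[:2] in FOURFOLD_PREFIXES and c[2] in _TCAG for c in codons):
            for name, codon in zip(names, codons):
                kept[name].append(codon[2])
    return {name: "".join(bases) for name, bases in kept.items()}


def concatenate(alignments):
    """Join per-gene alignments into one supergene per species."""
    names = list(alignments[0])
    return {name: "".join(a[name] for a in alignments) for name in names}


# Toy two-species alignment (hypothetical sequences, not study data).
toy = {"sp1": "ATGGCTTCAGGA", "sp2": "ATGGCCTCTGGG"}
print(first_codon_positions(toy))
print(fourfold_sites(toy))
print(concatenate([first_codon_positions(toy), fourfold_sites(toy)]))
```

Substitutions at 4d sites do not change the encoded amino acid, so these sites are commonly treated as evolving close to neutrally, whereas first codon positions evolve more slowly; keeping the two classes distinct lets each contribute its own rate signal to the dating analysis.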
For comparison, phylogenetic trees of the marsupial orders
Dasyuromorphia and Didelphimorphia were obtained by querying the
PHYLACINE (The Phylogenetic Atlas of Mammal Macroecology) resource
(Faurby et al., 2018).