2.7 Phylogeny and divergence time estimation
Marsupials (metatherians) and eutherians diverged ~160 million years ago, long before the radiation of extant eutherian clades (~100 million years ago) (Luo, Yuan, Meng, & Ji, 2011; Phillips, Bennett, & Lee, 2009). We employed the platypus (Ornithorhynchus anatinus) as an outgroup because this species is frequently used as an outgroup for both eutherian and marsupial mammals and has a high-continuity, well-annotated genome (contig N50 15.1 Mb; scaffold N50 83.3 Mb). In contrast, the most closely related eutherian mammals, the edentates (sloths, armadillos, and anteaters), currently have relatively poor-quality genome assemblies. These include, for example, the armadillo Dasypus novemcinctus (contig N50 0.03 Mb; scaffold N50 1.7 Mb) and the sloth Choloepus hoffmanni (contig N50 0.06 Mb; scaffold N50 0.4 Mb).
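The contig and scaffold N50 values quoted above summarise assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation in Python. The function name `n50` is hypothetical, and the values reported in the text were taken from the published assemblies rather than computed with this code.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (bp).

    Illustrative helper only (hypothetical name): sort the lengths from
    longest to shortest and return the length at which the running total
    first reaches half of the summed assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example with eight contigs totalling 32 bp.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_contigs)
```

An assembly made of a few long sequences gives a high N50, whereas a fragmented assembly of many short sequences gives a low one, which is the contrast drawn between the platypus and edentate genomes.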
We identified 7,116 high-confidence 1:1 orthologs by interrogating the predicted proteins from the gene models of 11 species (nine marsupials and the platypus) using SonicParanoid v1.3.0 (Cosentino & Iwasaki, 2019). Because the A. arktos, A. argentus, and M. murexia pseudo-genomes were derived from A. flavipes, their genes were extracted using the A. flavipes gene annotation file. The corresponding coding sequences (CDS) for each species were aligned using PRANK v100802 (Loytynoja & Goldman, 2005) and filtered with Gblocks v0.91b (Talavera & Castresana, 2007) to identify conserved blocks (removing gaps and ambiguous sites, and excluding alignments shorter than 300 bp), leaving 7,116 genes. Maximum-likelihood (ML) phylogenetic trees were generated using RAxML v7.2.8 (Stamatakis, 2006) and FastTree v2.1.10 (Price, Dehal, & Arkin, 2010) with three CDS data sets: the whole sequence, first codon positions, and fourfold degenerate (4d) sites. Identical topologies and similar support values were obtained with 1,000 bootstrap iterations. Divergence times between species were estimated using MCMCTree, a Bayesian molecular clock method implemented in PAML v4.7 (Yang, 2007), with the JC69 nucleotide substitution model (Jukes & Cantor, 1969). The inputs were the 4d ML tree and concatenated supergenes of first codon positions and 4d sites. We used 100,000 iterations after a burn-in of 10,000 iterations. MCMCTree calibration points (million years ago; Mya) were obtained from TimeTree (Kumar, Stecher, Suleski, & Hedges, 2017): O. anatinus–P. cinereus (~167–192 Mya), V. ursinus–D. virginiana (~72–86 Mya), V. ursinus–M. melanurus (~56–64 Mya), V. ursinus–P. cinereus (~31–39 Mya), M. domestica–D. virginiana (~24–39 Mya), and S. harrisii–M. melanurus (~4–22 Mya).
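The first-codon-position and 4d data sets described above can be derived from in-frame codon alignments. The text does not state which tool was used for this step, so the following Python code is only a sketch under stated assumptions: the standard genetic code, alignments held as dictionaries mapping species name to sequence, and a conservative rule that keeps a 4d site only when every species shares the same fourfold-degenerate codon prefix at that codon. All function names are hypothetical.

```python
# Two-base codon prefixes whose third position is fourfold degenerate
# under the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def first_codon_positions(aln):
    """Keep positions 1, 4, 7, ... of each in-frame aligned CDS."""
    return {name: seq[0::3] for name, seq in aln.items()}


def fourfold_sites(aln):
    """Keep third positions of codons that are fourfold degenerate in all species.

    Assumption (not from the source): a codon column is kept only if every
    species has the same fourfold prefix and an unambiguous third base.
    """
    names = list(aln)
    length = len(aln[names[0]])
    if length % 3 or any(len(seq) != length for seq in aln.values()):
        raise ValueError("sequences must be equal-length, in-frame codon alignments")
    kept = {name: [] for name in names}
    for start in range(0, length, 3):
        codons = [aln[name][start:start + 3].upper() for name in names]
        prefixes = {codon[:2] for codon in codons}
        if (len(prefixes) == 1 and prefixes <= FOURFOLD_PREFIXES
                and all(codon[2] in "ACGT" for codon in codons)):
            for name, codon in zip(names, codons):
                kept[name].append(codon[2])
    return {name: "".join(bases) for name, bases in kept.items()}


def concatenate(alignments):
    """Join per-gene alignments into one supergene per species."""
    names = list(alignments[0])
    return {name: "".join(aln[name] for aln in alignments) for name in names}


# Toy two-species alignment: codons ATG GCT CTA AAA vs ATG GCC CTG AAG.
toy = {"sp1": "ATGGCTCTAAAA", "sp2": "ATGGCCCTGAAG"}
supergene = concatenate([first_codon_positions(toy), fourfold_sites(toy)])
```

In the toy alignment only the second and third codons (prefixes GC and CT) contribute 4d sites. Requiring an identical prefix across species is equivalent to requiring the same amino acid, because each amino acid has at most one fourfold-degenerate codon family in the standard code.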
For comparison, phylogenetic trees of marsupials in the orders Dasyuromorphia and Didelphimorphia were obtained by querying the PHYLACINE (The Phylogenetic Atlas of Mammal Macroecology) resource (Faurby et al., 2018).