TiLV genome dataset
Individual genome segments of TiLVs isolated from eight infected fish
samples were RT-PCR amplified, cloned, and sequenced. The samples were
collected in Thailand between 2014–2019. The sequences are available
from the GenBank database (accession numbers: MN697695–697774). We were
able to obtain complete genomes for all eight of them; each comprised 10
segments, termed Segment 1 to 10 according to their sizes (Segment 1
being the largest; Figure S1 ). Together with nine previously
published TiLV complete genomes, we estimated the past transmission
history and origin of TiLV. The information of all 17 complete genomes
used in this study is summarised in Table S1 .
Reassortant detection
analysis
A recent study showed that TiLVs can and do undergo reassortment process
(Chaput et al., 2020). This means that different genome segments may
have different evolutionary histories, and analysing multiple genomic
regions with conflicting evolutionary signals together can result in an
erroneous estimation of the virus evolutionary history.
Potential conflicting evolutionary signals were checked in a
concatenated nucleotide alignment of segments 1–10 by using several
programs (see Materials and Methods ). A few reassortants were
detected in our dataset (Figure 1 ). The detected signal
breakpoints coincided well with the segment boundaries. Our results
showed that segments 1–4 shared the same evolutionary history, while
segments 5 and 6 had their own distinct histories. Segments 7–10 also
shared a common history, but which was distinct from those of others.
The concatenated alignment was thus split accordingly (Table
S2 ).
Our analysis identified three potential reassortment events: one
involving exchanging of segments 1–4 between (the ancestors of)
TH-2016-CN isolate and an unknown TiLV, another one involving exchanging
of segment 5 between (the ancestors of) TH-2018-K and BD-2017, and the
other one involving an exchange of segments 5 and 6 between (the
ancestors of) IL-2011 and EC-2012. These results were consistent with
the findings previously reported by Chaput et al. (2020). They compared
tree topologies estimated from individual genomic segments, and found
evidence for reassortment of segments 5 and 6 involving IL-2011 and
EC-2012, and that BD-2017 might be involved in a reassortment event.
This suggested a role of reassortment in emergence of new combinations
of TiLV genomic materials, similar to the one well-established in its
distant relatives, influenza viruses (Steel and Lowen, 2014).
Phylogenetic analyses of
TiLVs
Chaput et al. (2020) analysed each individual genome segment separately,
and yielded phylogenetic trees with high estimation uncertainties
(Chaput et al., 2020). This was likely because each segment on its own
had relatively low phylogenetic signals. In this study, we therefore
grouped the segments based on their shared evolutionary history, and
analysed them together. This allowed us to construct relatively longer
nucleotide alignments with more phylogenetically informative nucleotide
positions (Table S2 ), and in turn gave us more power to resolve
the phylogenetic relationships and yield more robust evolutionary
estimates (Figure 2 ). Our sequence collection was also
“time-stamped”, enabling us to perform tip-dating analyses. In this
study, we applied a Bayesian phylogenetic framework to simultaneously
co-estimate the TiLVs’ phylogenetic relationships, divergence dates, the
rate of evolution, and their population dynamics.
The temporal signal in each sequence dataset was first quantified using
root-to-tip regression analysis (Rambaut et al., 2016), and we found
that all datasets had a statistically significant signal (p <
0.05; Table S3 ), sufficient for tip-dating analysis. Across all
four genomic regions, our analyses consistently estimated the time to
most recent common ancestor (tMRCA) of all TiLVs to be between
2003–2009, with the rates of evolution ranging between
~1.81–3.47×10-3 substitutions per
site per year (s/n/y) (Figure 2 and Table 1 ). These
rate estimates fell within the expected range RNA viruses’ rate (Sanjuán
et al., 2010). The analysis of segments 1–4 yielded the most precise
date estimate of around mid 2008 (95% highest posterior density (HPD)
interval = 2007.5–2009.6). Segment 6 gave the oldest estimate in 2003,
but it was also the most uncertain one (95% HPD interval = 1992–2008).
This might be indicative of a low, but detectible, temporal signal in
the data. Alternatively, this could genuinely reflect its unique early
origin and reassortment history. Population dynamics of TiLV estimated
from segments 1–4 and 7–10 yielded a potential peak in the population
size around 2014–2016, followed by a continuous decline in the
population from 2016 onwards. Analyses of segments 5 and 6 gave similar
results, although the peaks and the declines were much more modest
(Figure 3 ).
Similar estimates of the dates, rates, and demographic histories were
obtained under the relaxed clock model (Figures
S2 –S3 , and Table S4 ). Indeed, our analyses did not
favour the relaxed clock over the strict clock for all analyses
(Table S4 ; log Bayes factor < 3), suggesting
relatively constant evolutionary rates over time. Analyses assuming the
same evolutionary rate for all alignments based on the Skygrid
coalescent model under the strict clock assumption yielded a rate
estimate of ~3.0 (2.2–3.4) ×10-3s/n/y, a tMRCA estimate of 2007–2008, and a population size trajectory
that peaked around 2014 (Figures S4 –S5 , andTable S5 ), again, all comparable to those from the main
analyses. The estimated phylogenetic relationships were also consistent
with those obtained from the main analyses. Finally, molecular clock
analyses of larger datasets for segments 1, 3, 5, and 9 yielded largely
consistent estimates of tMRCA, evolutionary rates, and demographic
histories (Figures S6 –S9 , Table S4 , andTable S7 ). Altogether, these analyses showed that our date, and
rate estimates, as well as demographic histories were robust to the
choice of model assumptions.
The estimated tMRCA of 2003–2009 matched well with the first report of
a sharp decline in tilapia populations in Israel from 2005 (Eyngor et
al., 2014), and the reported mass die-offs in farmed tilapia in Ecuador
and Israel staring in 2009 (Bacharach et al., 2016). This also suggested
the presences of TiLV in Israel and Ecuador many years before the first
report of the virus outbreak in 2014, which, incidentally, was also
around the time when the virus population was at its peak
(Figure 3 ). Moreover, the analyses consistently showed that the
TiLV population steadily declined from 2016 onwards, 1–2 years
following the first report of the virus. This could be a result of
natural resistance and herd immunity building up in the fish population
against the virus, and / or a reflection of a better awareness of the
virus, and a better-prepared and more cautious Tilapia importation
protocol following the first virus report.
Examination of the four phylogenies revealed complex histories of TiLVs
(Figure 2 ). We found that TiLVs from Thailand, including our
new samples, often appeared to cluster together, but not always as a
single group. They formed a well-supported cluster in the Segment 1–4
tree (clade support = 0.99), and the same was seen in the Segment 7–10
tree, but the grouping was not well supported (clade support <
0.80). In contrast, the virus appeared to split into several
well-supported clusters in the Segment 5 and 6 trees; however, due to
the low support of deep and long branches, it was therefore perhaps too
premature to make any detailed inferences about how TiLVs emerged or
were introduced to Thailand. Our analyses, nonetheless, dated the tMRCAs
of Thai TiLVs to be between 2008–2011 for the well-supported group in
the Segment 1–4 tree. Although TiLV in Thailand was not reported until
2017 (Dong, Siriroob et al., 2017; Surachetpong et al., 2017),
retrospective diagnosis of archived fish specimens revealed the TiLV
must have been present in Thailand before 2012 (Dong, Ataguba et al.,
2017), consistent with and further supporting our results. In addition,
analysis of segment 6 robustly inferred TH-2014 to be most closely
related to Middle East and South American isolates (clade support =
0.99), while it was found to group with other Thai TiLVs in the Segment
1–4 tree (clade support = 0.99). This suggested that reassortment could
indeed happen, even between viruses that are geographically very far
apart.
Regarding TiLVs from other countries, the two USA isolates from a
tilapia farm in Idaho were consistently shown to be embedded within the
clade of Thai TiLVs, having relatively short branches. In particular,
they were found to have a clear grouping structure with two Thai
isolates, namely TH-2018-N and TH-2019, forming a small clade in all
trees (clade support = 1). Their MRCA was dated to be around 2016–2017.
This coincided with a recent record that the farm in the USA imported
live tilapia from Thailand (Ahasan et al., 2020).
Similarly, the Bangladesh isolate, BD-2017, was inferred to be
clustering with Thai isolates in all trees, except in the Segment 5
tree, which showed that it was most closely related to Middle East and
South American TiLVs although not well supported. The youngest branching
date of the BD-2017 was estimated to be 2012 (segment 6; 95% HPD
interval = 2006–2015). Assuming that there was a single introduction of
the virus to Bangladesh, this result suggested that TiLV was introduced
to Bangladesh after the early 2010s. All of these findings were
perfectly congruent with an earlier forecast (Dong, Ataguba et al.,
2017), and the results previously reported by Chaput et al. (2020).
Regarding the four Middle East and South American TiLVs, they always
appeared to form a group in all trees. Specifically, TiLVs from Ecuador
and Peru were inferred to be sisters in all trees, except in the Segment
6 tree, which supported a sister relationship between the Ecuador
isolate EC-2012 and the Israeli isolate IL-2011 (Til-4-2011). In
contrast, the two isolates from Israel (Til-4-2011 and AD-2016) did not
form a clade, and were often found to be the most basal ones in the
group. The only exception was that they formed a clade in the Segment
7–10 tree, but then again, the clade was not well-supported, and it was
placed relatively deeply towards the root. Together, these findings
suggested a single introduction of TiLV into South America perhaps from
the Middle East, with an estimated timing around 2010–2013.
Our results were inconclusive about the likely geographical origin of
TiLV as a whole. The Segment 1–4 tree showed a well-supported cascading
structure of Israeli TiLVs basal to all other isolates, and the most
parsimonious interpretation of this was that TiLV originated in Israel
or nearby regions. We noted that, in the Segment 5 and 6 trees, the
isolates from Israel (and South America) were nested within Thai
isolates, suggesting that this might not be the case; however, the
support of such pattern was lacking. In fact, our analyses of all
segments apart from segments 1–4 consistently showed deep and long
branch structures towards the root of the trees. Coupled with the facts
that deep branching structures were not well-supported, and that the
samples were quite sparse, it was unclear where TiLV actually
originated. Additional sequences from diverse locations, and sampling
times are thus required to conclusively determine the geographical
origin of TiLV.