3.3. Whole-genome and ORF1a/b sequence analysis indicates
that the Peruvian PDCoV strain originated from a US PDCoV strain
The nucleotide sequence of the Peruvian PDCoV strain, designated
PDCoV/Peru/isolate/2019, was submitted to GenBank under accession
number MT227371. The genome organization of the Peruvian strain
matches that of other PDCoV genome sequences deposited in GenBank.
Excluding the poly(A) tail, the genome is 25,501 nt in length and
consists of the 5’-UTR (1-480 nt), ORF1a/b
(481-11368 nt, 11368-19283 nt), S (19265-22747 nt), E (22741-22992 nt),
M (22985-23638 nt), NS6 (23638-23922 nt), N (23943-24971 nt), NS7
(24037-24639 nt) and the 3’-UTR (24972-25501 nt). A graphical representation
of the genome of the characterized PDCoV strain is shown in Figure 2.
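The coordinates listed above can be checked with a short script. The following Python sketch stores each region as a 1-based, inclusive interval and derives its length and any overlaps with other regions. The class and variable names are hypothetical, and labelling the two ORF1a/b intervals as "ORF1a" and "ORF1b" is an assumption made here for illustration only.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A genome region with 1-based, inclusive coordinates."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive coordinates, so both end points are counted.
        return self.end - self.start + 1


# Coordinates as reported in the text for PDCoV/Peru/isolate/2019.
# Splitting ORF1a/b into "ORF1a" and "ORF1b" is an illustrative assumption.
FEATURES = [
    Feature("5'-UTR", 1, 480),
    Feature("ORF1a", 481, 11368),
    Feature("ORF1b", 11368, 19283),
    Feature("S", 19265, 22747),
    Feature("E", 22741, 22992),
    Feature("M", 22985, 23638),
    Feature("NS6", 23638, 23922),
    Feature("N", 23943, 24971),
    Feature("NS7", 24037, 24639),
    Feature("3'-UTR", 24972, 25501),
]

# Length of every region, keyed by name.
lengths = {f.name: f.length for f in FEATURES}

# Every pair of regions that shares at least one nucleotide position.
overlaps = [
    (a.name, b.name)
    for i, a in enumerate(FEATURES)
    for b in FEATURES[i + 1:]
    if a.start <= b.end and b.start <= a.end
]

for feature in FEATURES:
    print(f"{feature.name}: {feature.length} nt")
print("Overlapping regions:", overlaps)
```

With these coordinates, S spans 3,483 nt and NS7 lies entirely within the N region, which explains why the NS7 interval does not follow N in genome order.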
Phylogenetic analyses are typically based on one or a few major genes
of the organism of interest. However, this approach restricts the
inference to a single gene or group of genes. In contrast, whole-genome
sequencing offers a more complete and deeper genetic characterization
than partial approaches. In our study, we used next-generation
sequencing of our PDCoV strain to trace its evolutionary
origin. Our results indicate that the Peruvian strain belongs to the
North American phylogroup and is closely related to a PDCoV strain from
the US isolated in 2015 (99.5% nucleotide identity). Pairwise
comparison of the Peruvian PDCoV strain with the other PDCoV strains
analysed reveals high nucleotide identity, between 97.1 and 99.5%.
Compared with the US strains, the Peruvian PDCoV has a nucleotide
identity of 99.45 to 99.51%. Identity ranges from 98.6 to 98.74% when
compared with the Chinese strains. Finally, nucleotide identity is 97%
and 97.5% for the Thai and Vietnamese strains, respectively. Nucleotide
identities are summarized in Table 2. Further analysis based on ORF1a/b
yielded a tree with the same topology as the whole-genome phylogenetic
tree. Altogether, these results indicate that the virus detected in Peru
emerged from a North American ancestor (see Figure 3A and 3B). Similarly,
the tree based on PDCoV protein sequences had a topology resembling that
of the nucleotide-based tree.
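The identity percentages reported above can be read as the share of aligned positions at which two genomes carry the same nucleotide. The following Python sketch shows that calculation for a pair of already aligned sequences. It is illustrative only: this section does not state which software or gap treatment was used, so skipping gapped columns is an assumption, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    Other tools may count gaps as mismatches instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gapped columns
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy sequences, not real PDCoV data: 7 of 8 columns are identical.
toy_a = "ATGCATGC"
toy_b = "ATGCATGA"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.2f}% nucleotide identity")
```

Applied to every pair of sequences in a multiple alignment, the same function would produce an identity matrix of the kind summarized in Table 2, although the published values may rest on a different gap convention.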