1. Introduction
In December 2019, China reported a disease with pneumonia-like
conditions that caused respiratory malfunction as a result of a viral
infection. The virus later proved lethal and caused a global pandemic.
The World Health Organization (WHO) named the disease “coronavirus
disease 2019 (COVID-19)”. Following the international standards of
nomenclature, the virus was designated severe acute respiratory syndrome
coronavirus 2 (SARS-CoV-2) because of its taxonomic and genomic
relationship with the species severe acute respiratory syndrome-related
coronavirus [1].
In its initial stages the pandemic was centered in China, but Spain,
Italy, Brazil, France, the United States of America, Iran, and India
were also severely affected within a short period. In 2020 the world saw
the largest lockdown in history, imposed to reduce the spread of CoV-2,
which greatly affected the economies of the world powers. Despite the
initial precautions taken by people, the virus had affected 108M people
around the globe, with 80.8M recoveries and 2.31M deaths, by January
2021. The world had previously experienced coronavirus outbreaks in the
form of MERS-CoV and SARS-CoV, which affected the Middle East and other
countries to a large extent in 2012 and 2002, respectively.
Coronaviruses spread widely in humans, other mammals, and birds, mainly
affecting their respiratory, hepatic, intestinal, and nervous
systems [2,3].
Human coronaviruses (HCoVs) were first identified in the mid-1960s. To
date, seven HCoVs are known: two α coronaviruses (CoV-229E and
CoV-NL63) and five β coronaviruses (CoV-OC43, CoV-HKU1, SARS-CoV,
MERS-CoV, and CoV-2) [4,5]. Because CoV-2 belongs to the previously
known family of coronaviruses, it retains their structural features and
shows close genomic similarity to SARS-CoV. CoV-2 harbors a linear,
single-stranded, positive-sense RNA genome and rapidly infects
vertebrates; coronaviruses are named for the crown-like spikes on their
surface [6]. Its structural proteins are the spike protein (S), which
forms the crown-like surface projections, the envelope protein (E), the
membrane protein (M), and the nucleocapsid protein (N) [7]. These
structural proteins are responsible for viral replication and
virion-receptor attachment and are thus involved in the pathogenicity,
spread, and entry of the virus into the host organism.
Within a short period of time the virus has shown its ability to mutate,
giving rise to new, resistant, and more pathogenic strains that could be
more difficult to counter. The large number of variants may render drugs
designed against CoV-2 ineffective or reduce vaccine efficacy. The
genome of CoV-2 comprises 12 functional open reading frames (ORFs) and
11 protein-coding genes, with 12 expressed proteins; the 5′-capped mRNA
has a GC content of 38%, with a UTR and a poly-A tail at the 3′
end [6]. The ORFs are arranged on the CoV-2 mRNA in the order ORF1a,
ORF1b, spike (S), ORF3a, envelope (E), membrane (M), ORF6, ORF7a, ORF7b,
ORF8, nucleocapsid (N), and ORF10 [8] (Figure 1). The genome of CoV-2
encodes 16 non-structural proteins (NSPs) and four structural proteins,
as well as polyprotein 1a and polyprotein 1b [9]. Among the NSPs, the
replicase and protease are important for viral genome replication and,
along with the structural proteins, are potential drug
targets [7,10,11].
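The gene order described above can be written down as a small lookup
structure. The following is a minimal sketch, assuming only the ORF
names and their order as listed in the text; the names ORF_ORDER,
STRUCTURAL, and structural_in_genome_order are hypothetical and chosen
for illustration, and no genomic coordinates are implied.

```python
# ORF order on the genome (5' to 3') exactly as listed in the text.
# Names only; no coordinates or lengths are assumed here.
ORF_ORDER = [
    "ORF1a", "ORF1b", "S", "ORF3a", "E", "M",
    "ORF6", "ORF7a", "ORF7b", "ORF8", "N", "ORF10",
]

# The four structural proteins named in the text.
STRUCTURAL = {
    "S": "spike",
    "E": "envelope",
    "M": "membrane",
    "N": "nucleocapsid",
}


def structural_in_genome_order(orf_order=ORF_ORDER, structural=STRUCTURAL):
    """Return (1-based position, symbol, name) for each structural gene."""
    return [
        (position, orf, structural[orf])
        for position, orf in enumerate(orf_order, start=1)
        if orf in structural
    ]


if __name__ == "__main__":
    for position, symbol, name in structural_in_genome_order():
        print(position, symbol, name)
```

Keeping the order in a list and the structural subset in a separate
mapping makes it straightforward to restrict later analyses to S, E, M,
and N without hard-coding their positions.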
Although both the structural and non-structural proteins of CoV-2 are
important to investigate, here we examined only the variations in the
structural proteins because of their importance as potential drug
targets and vaccine candidates. This is the first comprehensive study in
which we screened 295,000 complete genomes of SARS-CoV-2 for variants in
the structural proteins. Exploring the degree of variation in these
important target proteins might help in projecting the pathogenicity and
transmission of CoV-2 strains around the world. A large number of
variations may lead to conformational changes in the targets and, in
turn, to therapeutic failure. Diagnostic accuracy may also be affected
if proper screening is not performed. Alternatively, vaccines and
antivirals specific to geographic strains might be more effective.
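To make the idea of screening for variants concrete, the sketch below
compares protein sequences against a reference and reports amino-acid
substitutions with their frequencies. It is an illustration only and is
not the pipeline used in this study: it assumes the sequences are
already aligned to the same length, it ignores insertions and
deletions, and the function names and toy sequences are invented for
the example.

```python
from collections import Counter


def substitutions(reference, sample):
    """List substitutions as '<ref residue><1-based position><new residue>'.

    Assumes both sequences are pre-aligned and of equal length. Gaps ('-')
    and ambiguous residues ('X') in the sample are skipped, not reported.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{position}{alt}"
        for position, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt and alt not in ("-", "X")
    ]


def variant_frequencies(reference, samples):
    """Return the fraction of samples that carry each substitution."""
    if not samples:
        return {}
    counts = Counter()
    for sample in samples:
        counts.update(substitutions(reference, sample))
    return {variant: n / len(samples) for variant, n in counts.items()}


# Toy sequences invented for illustration; not real SARS-CoV-2 data.
TOY_REFERENCE = "MFVFLVLLPL"
TOY_SAMPLES = ["MFVFLVLLPL", "MFVFLVLLPS", "MFIFLVLLPS", "MFVFXVLLPL"]

if __name__ == "__main__":
    frequencies = variant_frequencies(TOY_REFERENCE, TOY_SAMPLES)
    for variant, frequency in sorted(frequencies.items()):
        print(variant, frequency)
```

In practice, each genome would first be aligned to a reference and
translated per gene before a comparison of this kind; those steps are
omitted here.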