Discussion
Here we investigated the origin of the SARS-COV2 virus, which has caused a worldwide pandemic with considerable morbidity. A vaccine is unlikely to be available in the near future, and much more morbidity is expected. For now, people must depend on various medicines on a trial-and-error basis. To develop effective vaccines and medicines, and to test them in animals, it is necessary to know the origin of this virus and its intermediate host, if any existed before it emerged as a major human-infecting virus.
Three possibilities were predicted earlier. (i) SARS-COV2 came directly from bat RaTG13 with defective entry-point residues in the RBD and therefore with poor, inefficient infectivity, as observed in civet cats (7) or in forced infection of mice (18); it then remained silent for a long time while becoming highly adapted to replicate with a slower mutation rate and to survive in the specialized immune system of the human body, and eventually its entry-point residues were modified and perfected to attain widespread infectivity. (ii) It first gained the efficient entry-point mutations needed to bind the ACE2 receptor and enter the human body, and then perfected itself by adapting, with a higher mutation rate, to evade the human immune system. (iii) It entered a human-like intermediate host from bat with defective entry-point residues, adapted there for a long time, and then entered the human host recently and survived easily with optimum mutational fitness. In all cases, after adaptation in the human host, it gained more virulence through further substitutions followed by selection pressure.
Our analysis indicates an extremely low frequency of SARS-COV2 mutation in the human host. Mutation frequency can be confounded by selection and genetic drift. At optimal mutational fitness, the observed mutations are generally biased towards nonlethal ones, and most are either beneficial or neutral, so the observed frequency may dramatically underestimate the true mutation frequency. In that case the mutation rate itself could also be lower, as deleterious mutations drive the mutation rate down (13). Of the two models of viral mutation rate, speed versus adaptability, the evolution of SARS-COV2 appears to fit the adaptability model. The adaptability model states that after a long adaptation to evade the immune system, selection pressure is relatively low and the supply of beneficial mutations is reduced, so the population favors a low mutation rate. The mutation rate reaches an optimum level because selection has acted on it for a long time in the context of immune escape, bringing the virus to maximum mutational fitness (13, 14, 19, 20).
If SARS-COV2 came directly from bat, as is presumed, it would have taken a very long time to evolve into the present-day SARS-COV2 virus in the human host. The only assumption that permits this is that the virus persisted in the human respiratory tract as a silent virus and gained virulence after a long period of adaptation. However, the emergence of a new strain with more infective power has been demonstrated within the last 6 months (10). The appearance of such a strain suggests that SARS-COV2 could not have resided in the human host for a very long time without revealing its existence, even in a very mild form, given that the human immune system tends to eliminate it.
However, our analysis has some limitations. It is unknown why the mutation rate in April (4.89 nt) is almost double that of the previous months. Possible explanations include biased sampling, in which a particular variant strain is over-represented relative to other, slowly mutating strains; the inclusion of a single genome carrying a 17-base deletion; or, as expected, a larger number of viral generations in April than in previous months owing to widespread infection. Whether this increasing trend continues could not be verified, because SARS-COV2 genomic sequences beyond April were unavailable. Also, we aimed to assess the average mutation rate of the SARS-COV2 virus rather than a strain-specific rate, assuming that all strains are capable of infecting humans efficiently and of undergoing substitutions to evolve into a better strain. We also did not separate synonymous from nonsynonymous mutations, although selection on nonsynonymous mutations would have been much more stringent. Another important consideration is that we did not observe any recombination or large insertions (except one in a genome collected in Washington in April) in these four months; such events can occur at any time, and their frequent occurrence could increase the mutation rate. However, such an event is likely to be very rare in a virus at optimal mutational fitness and may not add much weight to the overall mutation rate in the long run. Lastly, we estimated the mutation rate of SARS-COV2 in the human host but extended it to calculate the time taken by this virus to evolve from bat RaTG13. Such an estimate may require extensive experimental study in the bat system, where the selection pressure would differ from that in human. Nevertheless, for the virus to have evolved in less time in bat than in human (<30 years), the mutation rate in bat would have to be higher than in human, which in turn would imply that the virus faces a much more challenging environment in bat than in human. That is not expected, because RaTG13 is native to bat (adapted over a long time).
Among the key entry-point residues in SARS-COV2, 455L, 459Y and 500T are the same in both RaTG13 and Pan_SL_COV_GD, so they could have come from either of them. The most important human ACE2 residue for the interaction with SARS-COV2 is K353, which binds 501N; 501N can evolve by conversion of D (aspartic acid) to N (asparagine) through a single nucleotide mutation (G>A). It should also be noted that this single nucleotide mutation alone would almost give RaTG13 a passport to infect humans efficiently.
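The single-nucleotide D-to-N conversion can be checked against the standard genetic code. The following is a minimal sketch, not part of the original analysis: it assumes the standard codon table, and the function name `single_step_changes` is hypothetical. It lists every single-base substitution that turns an aspartic acid codon into an asparagine codon, without reference to the actual codon used at position 501 in any genome.

```python
from itertools import product

# Standard genetic code, bases ordered U, C, A, G (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_step_changes(src_aa, dst_aa):
    """List single-base substitutions converting a src_aa codon into a dst_aa codon.

    Each hit is (original codon, mutant codon, 1-based codon position, change).
    """
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != src_aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == dst_aa:
                    hits.append((codon, mutant, pos + 1, f"{codon[pos]}>{base}"))
    return hits


d_to_n = single_step_changes("D", "N")
for original, mutant, position, change in d_to_n:
    print(f"{original} -> {mutant}: position {position}, {change}")
```

Under the standard code, both aspartic acid codons (GAU, GAC) reach an asparagine codon through a G>A change at the first codon position, which agrees with the G>A substitution described above.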
In contrast, 493Q requires mutations at two nucleotides, at the 1st (U>C) and 3rd (U>A) codon positions, and they must occur in that order; otherwise the intermediate would be a nonsense codon (UAA). If the 1st-position mutation occurs before the 3rd-position mutation, the intermediate codon CAU codes for histidine (H). Thus, an intermediate ancestor of SARS-COV2 carrying 493H must exist in bat, in pangolin, or in some other intermediate host.
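The order dependence of the two substitutions can be illustrated with a short enumeration. This is a sketch under stated assumptions: it uses the standard genetic code, takes the start codon UAU (Y) and end codon CAA (Q) from the substitutions described above (U>C at position 1, U>A at position 3), and the helper name `stepwise_paths` is hypothetical.

```python
from itertools import permutations, product

# Standard genetic code, bases ordered U, C, A, G (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def stepwise_paths(start, end):
    """Enumerate every order of single-base changes leading from start to end.

    Returns one list per order; each list holds (codon, amino acid) pairs,
    with '*' marking a stop (nonsense) codon.
    """
    differing = [i for i in range(3) if start[i] != end[i]]
    paths = []
    for order in permutations(differing):
        codon = start
        steps = [(codon, CODON_TABLE[codon])]
        for pos in order:
            codon = codon[:pos] + end[pos] + codon[pos + 1:]
            steps.append((codon, CODON_TABLE[codon]))
        paths.append(steps)
    return paths


paths = stepwise_paths("UAU", "CAA")
for steps in paths:
    print(" -> ".join(f"{codon}({aa})" for codon, aa in steps))
# UAU(Y) -> CAU(H) -> CAA(Q)
# UAU(Y) -> UAA(*) -> CAA(Q)
```

Only the path through CAU (H) avoids a stop codon, which is why a 493H intermediate is the viable route under these assumptions.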
Genetic drift might come into play in the conversion Y>H>Q, but such drift can occur only after the virus has entered a human or intermediate host. The silent presence of a SARS-COV2-related virus in humans over a long period has not been documented. Also, no evidence supports the notion that any primate population has been endangered by, or has suffered from, a recent viral attack. The mutations must have occurred inside a host, but it is possible that this intermediate host no longer exists in nature (has been wiped out) or is yet to be discovered. In addition, the current genomic and amino acid sequences of SARS-COV2, with 493Q and 501N in the RBM, suggest that SARS-COV2 could infect any primate or higher-order mammal with K31 and K353 residues in its ACE2 receptor, and any of these could serve as an intermediate host. To date, however, none of them has been shown to naturally harbor SARS-COV2 or any closely related virus. Li et al. (2004) (4) also suggested that such an intermediate host can never be identified. Although it is impossible to conclude that such an intermediate host can never be found, a systematic search for such a host can be continued.
Our analysis does not find any of these conditions to be satisfied: there is no evidence of a silent presence of SARS-COV2 in humans over the approximately 30 years that would be needed for it to evolve into the present-day SARS-COV2, no evidence of a very high mutation rate, and no evidence of the required intermediate host carrying an intermediate virus with 493H. Taken together, the absence of any intermediate host or virus between bat and human, and the inability of the virus to remain silent in the human host for a long time, can also lead one to believe that SARS-COV2 could more easily have been created unnaturally.