3.3 Analysis of nucleotide sequence of the cap genes and amino acid sequence of the Cap proteins
To explore the genetic diversity of PiCV strains, the cap genes of 90 PiCV strains from the 120 PiCV-positive samples were sequenced. Detailed information for the 90 cap genes is shown in Table S2. The 90 cap genes were compared with reference sequences from China and from other countries. The cap genes ranged from 813 to 828 nt in length: 14, 13, 1, 53, 7, and 2 of the 90 cap nucleotide sequences were 813, 816, 819, 822, 825, and 828 nt long, encoding Cap proteins of 270, 271, 272, 273, 274, and 275 residues, respectively (Table S2). In addition, ATT and GTG were also found at the start codon position. Sequence comparison of the 90 cap genes revealed nucleotide identities of 71.9%–100% and deduced amino acid identities of 71.7%–100%. The cap genes showed variable and, at the lower end, low sequence identity with the Chinese PiCV reference strains (73.0%–99.6% nucleotide identity, 72.3%–100% amino acid identity) and with the PiCV reference strains from other countries (68.8%–98.4% nucleotide identity, 63.6%–100% amino acid identity) (Table S3).
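The relationship between the reported gene lengths and Cap protein lengths can be checked with a short calculation. The sketch below assumes that each reported cap length includes the stop codon, which is consistent with the figures above (for example, 813 nt corresponds to 271 codons, of which 270 encode residues). The function name `cap_residues` and the dictionary layout are illustrative only and are not part of the original analysis.

```python
def cap_residues(orf_nt: int) -> int:
    """Number of residues encoded by an ORF whose length includes the stop codon.

    Assumption: the reported nt length covers start codon through stop codon.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nt // 3 - 1  # subtract the stop codon


# Length classes (nt) and strain counts as reported in the text.
cap_length_counts = {813: 14, 816: 13, 819: 1, 822: 53, 825: 7, 828: 2}

# The counts should add up to the 90 sequenced cap genes.
total_strains = sum(cap_length_counts.values())

# Expected Cap protein length for each length class.
residues_by_length = {nt: cap_residues(nt) for nt in cap_length_counts}

if __name__ == "__main__":
    print(total_strains)
    print(residues_by_length)
```

Under this assumption, the six length classes account for all 90 sequences and reproduce the reported protein lengths of 270 to 275 residues.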
To investigate variations in the deduced amino acid sequences, the amino acid sequences of the 90 identified Cap proteins and the reference strains were aligned. Relative to the consensus sequence, the Cap proteins showed nine major deletion sites, at positions 7, 24, 29, 30, 35, 58, 130, 182, and 266 (Figure 1). A comparison of entropy (Hx) across the Cap amino acid sequences showed that, at most positions, the Hx of the 90 identified Cap proteins was higher than that of the PiCV reference strains (Figure 2). In addition, unique amino acid substitutions were observed at 28 different positions among the 90 Cap proteins, as shown in Figure 3.
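For readers unfamiliar with per-position entropy, the following is a minimal sketch of how Hx can be computed for each column of a protein alignment. It assumes the Shannon entropy with the natural logarithm and treats the gap character as one more symbol; the text does not state which software or settings were used, so both choices are assumptions. The function names and the toy alignment are hypothetical and are not data from this study.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (natural log) of one alignment column.

    Assumption: gaps ('-') are counted as an ordinary symbol.
    """
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def alignment_entropy(aligned_seqs):
    """Per-position entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy("".join(col)) for col in zip(*aligned_seqs)]


# Toy alignment for illustration only (not sequences from this study).
toy_alignment = ["MRK", "MRK", "MKK", "M-K"]
toy_hx = alignment_entropy(toy_alignment)

if __name__ == "__main__":
    for position, hx in enumerate(toy_hx, start=1):
        print(position, round(hx, 3))
```

A fully conserved column gives an entropy of zero, and the value rises as more distinct residues share the position, which is why higher Hx indicates greater variability.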
3.4 Analysis of nucleotide sequence of the rep genes and amino acid sequence of the Rep proteins
The rep genes of 68 of the 120 PiCV-positive samples were successfully sequenced. Detailed information is shown in Table S2. The 68 identified rep genes were compared with PiCV reference sequences from China and from other countries. All 68 rep genes used the ATG start codon and fell into two size classes: 948 and 954 nt. Forty of the 68 rep nucleotide sequences were 948 nt in length, encoding a Rep protein of 315 residues, and twenty-eight were 954 nt in length, encoding a Rep protein of 317 residues (Table S2). The difference in size was due to a two-amino-acid deletion at positions 2 and 3. Sequence alignment revealed nucleotide identities of 90.3%–100% and deduced amino acid identities of 92.7%–100% among the 68 rep genes. Compared with the cap genes, the rep genes showed higher similarity with PiCV reference strains from China (89.0%–99.2% nucleotide identity, 89.2%–99.6% amino acid identity) and from other countries (89.5%–98.3% nucleotide identity, 90.5%–99.3% amino acid identity) (Table S3). To investigate variations in the deduced amino acid sequences of the rep gene products, the amino acid sequences of the 68 identified Rep proteins and the reference strains were aligned. Unique amino acid substitutions were observed at 36 different positions among the 68 identified PiCV strains (data not shown).
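Identity ranges such as those reported above are the minimum and maximum over all pairwise comparisons in an alignment. The sketch below illustrates one way to obtain such a range. It assumes pre-aligned sequences, skips columns where both sequences have a gap, and counts a gap opposite a residue as a mismatch; the software used in this study may treat gaps differently, so the convention here is an assumption. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumptions: columns that are gaps in both sequences are ignored;
    a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_range(aligned_seqs):
    """Minimum and maximum pairwise percent identity in an alignment."""
    values = [percent_identity(a, b) for a, b in combinations(aligned_seqs, 2)]
    return min(values), max(values)


# Toy alignment for illustration only (not sequences from this study).
toy_seqs = ["ATGGCC", "ATGGCA", "ATG---"]
lowest, highest = identity_range(toy_seqs)

if __name__ == "__main__":
    print(round(lowest, 1), round(highest, 1))
```

The same function applies to nucleotide or amino acid alignments, since it only compares characters column by column.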