Identification of GC-rich regions and homopolymers in the rAAV
vector sequence
GC-rich regions were identified in the recombinant AAV2/8-CAG-GFP vector
sequence using the program NTContent
(http://github.com/emlec/NTContent), included in the
SSV-Conta package. Two sliding-window settings were used: a window size
of 200 with a step size of 20, or a window size of 50 with a step size
of 25. Mononucleotide repeats
composed of at least six nucleotides and simple sequence repeats (SSR)
were localized along the AAV vector genome using the MISA-web server
(https://webblast.ipk-gatersleben.de/misa/)
(Beier et al., 2017) with the following parameters: SSR motif
length/minimum number of repetitions, 1/6, 2/2, 3/2, 4/2, 5/2, 6/2, 7/2,
8/2, 9/2, 10/2; maximum length of sequence between two SSRs to register
as a compound SSR, 100; output file format, GFF.
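
The two scans described above can be illustrated with a minimal Python sketch. It is not the NTContent or MISA implementation: the function names, the 0-based end-exclusive coordinates, the dropping of a trailing partial window, and the toy sequence are all assumptions made for illustration only.

```python
import re


def gc_windows(seq, window, step):
    """Yield (start, end, gc_fraction) for each full sliding window.

    Coordinates are 0-based and end-exclusive. A trailing window shorter
    than `window` is skipped (an assumption; NTContent may differ).
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        yield start, start + window, gc / window


def homopolymers(seq, min_len=6):
    """Return (start, end, base) for runs of a single nucleotide that are
    at least `min_len` long (the 1/6 setting in the MISA parameters)."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1))
            for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence, not the AAV2/8-CAG-GFP vector sequence.
toy = "ATATAT" + "GGGGGGG" + "ACGT" * 3 + "CCCCCC" + "TTTTT"

# The window sizes from the text (200/20 and 50/25) would be passed the
# same way; a smaller window is used here because the toy sequence is short.
for start, end, frac in gc_windows(toy, window=12, step=6):
    print(start, end, round(frac, 2))

print(homopolymers(toy))
```

In this sketch the run of five T residues at the end of the toy sequence is not reported, because it falls below the minimum length of six.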