Identification of GC-rich regions and homopolymers in rAAV vector sequence
GC-rich regions were identified in the recombinant AAV2/8-CAG-GFP vector sequence using the program NTContent (http://github.com/emlec/NTContent), included in the SSV-Conta package. Two parameter sets were used: a window size of 200 with a step size of 20, or a window size of 50 with a step size of 25. Mononucleotide repeats of at least six nucleotides and simple sequence repeats (SSRs) were localized along the AAV vector genome using the MISA-web server (https://webblast.ipk-gatersleben.de/misa/) (Beier et al., 2017) with the following parameters: SSR motif length/minimum number of repetitions, 1/6, 2/2, 3/2, 4/2, 5/2, 6/2, 7/2, 8/2, 9/2 and 10/2; maximum length of sequence between two SSRs to register as a compound SSR, 100; output file parameter, GFF.
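For illustration only, the two kinds of scan described above can be sketched in a few lines of Python. This is not the NTContent or MISA implementation: the function names `gc_windows` and `homopolymers` are hypothetical, the handling of window boundaries (only full windows are reported, coordinates are 0-based and end-exclusive) is an assumption, and the regular expression covers only the mononucleotide case (motif length 1, at least six repetitions), not the longer SSR motifs or compound SSRs.

```python
import re


def gc_windows(seq, window=200, step=20):
    """Yield (start, end, gc_fraction) for each full sliding window.

    Coordinates are 0-based and end-exclusive (an assumption of this
    sketch; the original tools may report positions differently).
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        yield start, start + window, gc / window


def homopolymers(seq, min_len=6):
    """Return (start, end, base, length) for runs of one base >= min_len.

    A regex backreference stands in for the mononucleotide setting (1/6);
    this is a simplified substitute, not the MISA algorithm.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.end(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]
```

Under these assumptions, `gc_windows(sequence, window=50, step=25)` would correspond to the second parameter set, and windows whose GC fraction exceeds a chosen threshold could then be flagged as GC-rich; no such threshold is stated in the text.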