INTRODUCTION
The adeno-associated virus (AAV) is widely used as a viral vector to deliver therapeutic DNA. Following the success of clinical trials using recombinant AAV (rAAV), regulatory bodies have raised their requirements for the quality control (QC) of these new drugs. In particular, the presence of residual DNA in the final product is of significant concern because of the potential risks of oncogenicity, immunogenicity and decreased gene transfer efficiency (Wright, 2014). The consequences of co-injecting DNA contaminants with vectors into patients depend on multiple criteria, such as the type, the nature (i.e. free or encapsidated, fragmented, unmethylated) and the quantity of the DNA impurities. To limit these risks, the Food and Drug Administration (FDA) recommends a level of residual host cell DNA (HCD) below 10 ng per parenteral dose (Food and Drug Administration, 2012). This threshold can be difficult to meet in some cases, for example when a high dose of AAV vector is required to reach the therapeutic effect, as in the treatment of Duchenne muscular dystrophy (Crudele and Chamberlain, 2019) or spinal muscular atrophy (Al-Zaidy and Mendell, 2019). Quantification of HCD is most often based on real-time PCR, a targeted technique that analyzes only a small number of DNA species. In addition, it is subject to high variability due to a lack of harmonization (Ayuso et al., 2014; Dorange and Le Bec, 2018). To provide the community with a more exhaustive QC assay, our laboratory reported the Single-Stranded Virus Sequencing (SSV-Seq) method for the analysis of residual DNA in AAV vector stocks (Lecomte et al., 2019). SSV-Seq is based on Illumina high-throughput sequencing (HTS) and allows the identification and quantification of all DNA impurities that are co-purified (encapsidated or not) with rAAV particles.
The protocol has been adapted for the analysis of AAV vectors generated either by plasmid transfection of HEK293 mammalian cells (Lecomte et al., 2015) or by baculovirus infection of Sf9 insect cells (Penaud-Budloo et al., 2017). Using this method, we showed that DNA impurities mainly originate from the vector plasmid for the HEK293-based manufacturing platform and from the baculovirus genome for the Sf9-based platform, with a predominance of residual sequences proximal to the inverted terminal repeats (ITRs) (Penaud-Budloo et al., 2018a). In addition to the relative percentage of each DNA species, SSV-Seq can provide information on vector genome identity through a computational analysis of single nucleotide variants (SNVs) and of the sequencing coverage over the recombinant AAV genome. Since then, and as a sign of the interest in HTS-based methods applied to rAAV quality control, other protocols have been developed to analyze the identity (Guerin et al., 2020; Maynard et al., 2019) or integrity (Radukic et al., 2019; Tai et al., 2018; Xie et al., 2017) of AAV vector genomes by sequencing.
In this study, we show that a high GC content and the presence of homopolymers in the AAV vector genome impair the efficiency of PCR amplification during sequencing library preparation, leading to a decrease in coverage in the SSV-Seq protocol. To solve this issue, we have optimized the library preparation using a PCR-free protocol. The novel method, SSV-Seq 2.0, has been used to analyze a vector genome harboring a CMV early enhancer/chicken beta-actin (CAG) promoter, which is well known to be a difficult template for PCR and sequencing. HTS-based assays represent the most exhaustive way to control AAV vector quality and purity, and to fulfill the regulatory agencies' requirements in terms of residual nucleic acids.
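The two sequence features named above, high GC content and homopolymers, can be located in a vector genome with a simple scan. The Python sketch below is illustrative only and is not part of the SSV-Seq pipeline: the function names, the window size, the 70% GC threshold and the 8-nucleotide minimum run length are assumptions chosen for the example, not values taken from this study.

```python
from itertools import groupby


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_rich_windows(seq: str, window: int = 100, step: int = 50,
                    threshold: float = 0.70):
    """Return (start, end, gc) for sliding windows with GC fraction >= threshold.

    window, step and threshold are hypothetical defaults for illustration.
    """
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        gc = gc_fraction(chunk)
        if gc >= threshold:
            hits.append((start, start + len(chunk), gc))
    return hits


def homopolymer_runs(seq: str, min_length: int = 8):
    """Return (start, base, length) for runs of one base of at least min_length.

    min_length is a hypothetical cutoff for illustration.
    """
    runs = []
    position = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs
```

Applied to a vector genome sequence, the two lists give candidate regions in which a drop in coverage after PCR-based library preparation might be expected. Whether a given region is actually under-represented can only be established from the sequencing data.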