Methodologic differences and sources of bias
Analysis methods are a known source of bias in metagenomic studies.2,12,13 Brooks and colleagues determined that microbiome community composition can be biased at several steps, including DNA extraction, PCR amplification, sequencing, and taxonomic classification.12 The researchers reported that more than 85% of the microbiome community results were biased by small variations in methods (less than 5%). Specifically, variations in DNA extraction and amplification can reduce the representation of Streptococcus species. If women with GBS colonization harbor only small amounts of the bacterium, bias introduced at these steps could have contributed to the limited representation of GBS in vaginal microbiome data.
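To illustrate how a step-specific bias can shrink the apparent share of a low-abundance organism, the following minimal sketch assumes a simple multiplicative model: each taxon is recovered with its own efficiency, and the results are renormalized to proportions. The model, the function name, the community composition, and the efficiency values are all hypothetical choices made for illustration; they are not taken from Brooks and colleagues or from any study in this review.

```python
def observed_proportions(true_props, efficiency):
    """Scale each taxon's true proportion by its recovery efficiency,
    then renormalize so the observed proportions sum to 1."""
    weighted = {taxon: true_props[taxon] * efficiency[taxon] for taxon in true_props}
    total = sum(weighted.values())
    return {taxon: value / total for taxon, value in weighted.items()}


# Hypothetical community in which GBS makes up 2% of the true composition.
true_props = {
    "Lactobacillus crispatus": 0.90,
    "Gardnerella vaginalis": 0.08,
    "Streptococcus agalactiae": 0.02,
}

# Hypothetical efficiencies: Streptococcus is recovered at one quarter
# the rate of the other taxa during extraction and amplification.
efficiency = {
    "Lactobacillus crispatus": 1.0,
    "Gardnerella vaginalis": 1.0,
    "Streptococcus agalactiae": 0.25,
}

observed = observed_proportions(true_props, efficiency)
for taxon, proportion in observed.items():
    print(f"{taxon}: true {true_props[taxon]:.3f}, observed {proportion:.4f}")
```

Under these assumed values, the observed share of S. agalactiae falls from 2% to about 0.5%, which shows how an organism that is already scarce can drop further toward the limit of reporting.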
Such underrepresentation of known pathogens can significantly limit the clinical scope of vaginal microbiome studies. For example, many studies reported only results related to community state types or analyzed only the most abundant species. Consequently, if thresholds for inclusion are based on sample abundance (e.g., 5% of sample or 5,000 reads), GBS data may have been excluded because they did not meet the threshold.
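The effect of an abundance-based inclusion threshold can be sketched as follows. The function name and the read counts are hypothetical and chosen only for illustration; the two cut-offs mirror the examples given above (5% of the sample or 5,000 reads) and do not describe the pipeline of any particular study.

```python
def filter_by_abundance(read_counts, min_fraction=None, min_reads=None):
    """Return only the taxa that meet every threshold that was supplied.

    min_fraction: minimum share of the sample's total reads (e.g., 0.05).
    min_reads: minimum absolute read count (e.g., 5000).
    """
    total = sum(read_counts.values())
    if total == 0:
        return {}
    kept = {}
    for taxon, reads in read_counts.items():
        if min_fraction is not None and reads / total < min_fraction:
            continue
        if min_reads is not None and reads < min_reads:
            continue
        kept[taxon] = reads
    return kept


# Hypothetical sample of 50,000 reads in which GBS accounts for 1.2%.
sample = {
    "Lactobacillus iners": 42000,
    "Gardnerella vaginalis": 6500,
    "Prevotella bivia": 900,
    "Streptococcus agalactiae": 600,
}

kept_by_fraction = filter_by_abundance(sample, min_fraction=0.05)
kept_by_reads = filter_by_abundance(sample, min_reads=5000)

print(sorted(kept_by_fraction))
print(sorted(kept_by_reads))
```

In this hypothetical sample, both thresholds retain only L. iners and G. vaginalis, so GBS is present in the raw data but absent from the reported results.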
Furthermore, the selection of 16S regions for amplification and the associated primers can influence genus- and species-level resolution.14 For example, it is generally more difficult to distinguish Lactobacillus species within the V4 region, but it may be easier to do so in V1-V3.15 Conversely, G. vaginalis is more difficult to detect and differentiate within the V1-V2 region because its sequence is highly variable prior to the V1 region, making primer selection more complex.14 These biases can be partially mitigated by recognizing the characteristics of 16S rRNA sequencing and thoughtfully selecting primers. Unfortunately, studies that evaluate the influence of the selected 16S regions on GBS representation have not yet been published. In our review, the 13 studies that detected GBS used 16S regions within V1-V4 or V6. However, the 32 studies that did not report GBS used the same regions. Thus, assessment of bias by 16S region is important when designing and evaluating future vaginal microbiome studies.
Given the limitations in exploring the true proportions within the microbial community, studies have increasingly employed whole genome sequencing (WGS), which allows for higher resolution, down to the strain level. In our review, two studies employed WGS but did not report GBS.16,17 Both papers initially selected some of the most frequent species-level taxa, which are frequently Lactobacillus and Gardnerella species, for further analysis. This approach may have limited the advantage of WGS, causing researchers to miss rarer, yet clinically important, genera, species, and strains.