The Gene Content of a Barcode Gap
The pattern of little variation within a species but substantial variation between species is the reason that DNA barcoding has been proposed as a useful tool for taxonomists (Meyer and Paulay 2005). But what, specifically, are the fixed differences in nucleotide sequence that create barcode gaps? In vertebrates, including birds (Kerr 2011), mammals (Tobe et al. 2010), and fish (Ward and Holmes 2007), variation in amino acid sequence is rare in the barcoding region of the cytochrome c oxidase subunit one (COX1) gene. Thus, the barcode gap that is commonly observed using the conventional COX1 barcode gene (Kerr et al. 2007; Tavares and Baker 2008) consists almost entirely of synonymous nucleotide changes, and there is evidence for strong purifying selection on the nonsynonymous nucleotide positions within the COX1 barcode gene (Stewart et al. 2008; Kerr 2011; Popadin et al. 2013). Contrary to the prediction that adaptive evolution of the COX1 gene might underlie the evolution of DNA barcode gaps (Hill 2016), there is too little variation between sister taxa in the amino acid sequence of the COX1 gene product for this prediction to be correct (Kerr 2011). The paradox of the COX1 barcode gene is that, despite departure from the expectations of neutral theory, there seems to be little opportunity for adaptive divergence to create the differences among species in the nucleotide sequence of the COX1 barcode gene (Kwong et al. 2012). Certainly, there are a handful of very well-documented cases of COX1 diverging adaptively between sister taxa in response to changes in oxygen pressure (Scott et al. 2011; Luo et al. 2013; Tomasco and Lessa 2014) or hydrogen sulfide exposure (Pfenninger et al. 2014) in the external environment. Such adaptive divergences in COX1 genotype, however, cannot account for the barcode gaps that have been documented between thousands of sister taxa.
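The distinction between synonymous and nonsynonymous differences can be made concrete with a minimal Python sketch. It is an illustration only, not the method used in any of the studies cited here. It assumes two aligned, in-frame, gap-free sequences and uses the vertebrate mitochondrial genetic code (NCBI translation table 2); the function name `classify_codon_differences` and the two 12-base toy sequences are hypothetical. Note that it counts differing codons rather than individual nucleotide sites, which is cruder than the estimators normally used to test for purifying selection.

```python
from itertools import product

# Vertebrate mitochondrial genetic code (NCBI translation table 2).
# Codons are enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Count codons that differ between two aligned, in-frame sequences.

    Returns (synonymous, nonsynonymous): differing codons that encode the
    same amino acid, and differing codons that encode different ones.
    Assumes no gaps or ambiguity codes (a simplification for illustration).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy fragments, not real COX1 data.
species_a = "ATGCTAGGAACA"  # Met Leu Gly Thr
species_b = "ATACTTGGAGCA"  # Met Leu Gly Ala
print(classify_codon_differences(species_a, species_b))  # (2, 1)
```

In the toy pair, ATG versus ATA is synonymous only because ATA encodes methionine in the vertebrate mitochondrial code; under the standard nuclear code it would be scored as a nonsynonymous change to isoleucine, which is why the choice of translation table matters for this kind of tally.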
A paucity of nonsynonymous changes in the barcode region of the COX1 gene is not difficult to explain. COX1 is the least changeable gene in the entire mitochondrial genome (da Fonseca et al. 2008; Kerr 2011). The conserved nature of COX1 is a major reason that it was chosen as the barcode gene: primer sets developed for model species tend to work for non-model species (Hebert et al. 2003a). COX1 is one of thirteen protein subunits of Complex IV of the ETS, which is the rate-controlling enzyme in the OXPHOS system (Pacelli et al. 2011; Arnold 2012), and COX1 holds the key catalytic position of that crucial enzyme (Wang and Pollock 2007). Thus, Complex IV is a particularly critical enzyme in animal systems that depend on aerobic respiration, and the barcode gene, COX1, is the most critical subunit of this most critical enzyme (Pierron et al. 2012). I propose that the resolution of this paradox, species-specific variation in the COX1 barcode sequence without functional changes in the COX1 gene, lies in the tight linkage of genes on the mitochondrial chromosome and in the genetic hitchhiking of neutral substitutions in the COX1 barcoding region with adaptive changes in other regions of the mt genome (Meiklejohn et al. 2007).