To meet the challenges of increasing food demand and diverse breeding targets in plants and animals, molecular markers have been widely used to reveal polymorphism at the DNA level, and these markers have been applied to marker-trait association analysis, marker-assisted selection, genomic selection, germplasm characterization, etc. (Singh 2015, Xu 2008). Molecular markers such as restriction fragment length polymorphisms (RFLPs) \cite{Botstein1980} and simple sequence repeats (SSRs) \cite{Tautz1989} have been in use for around 40 years. However, these platforms have very limited throughput, typically hundreds of loci on hundreds of lines. With fluorescence hybridization-based microarrays or next-generation sequencing-based genotyping platforms, throughput increases to tens of thousands of loci on an almost unlimited number of lines. Among the next-generation sequencing-based methods, the most commonly used approaches include restriction-site associated DNA sequencing (RAD) \cite{Miller2007}, genotyping-by-sequencing (GBS) \cite{Elshire2011}, and specific-locus amplified fragment sequencing (SLAF-seq) \cite{Sun2013}. However, these reduced-representation sequencing platforms suffer from a high missing rate and under-calling of heterozygous sites in highly diverse and heterozygous species. In a previous study, we found that the amplicon sequencing (AmpSeq) platform solved both the high missing rate and the under-calling of heterozygous sites. The remaining problem for highly diverse and heterozygous species is marker transferability. For example, hardwood breeding in Eucalyptus involves species that diverged 2 to 5 Mya \cite{Grattapaglia2011}, and grape breeding often involves species that diverged around 20 Mya \cite{Vezzulli2008}. A universal interspecies molecular marker panel is therefore in high demand.
In addition, a universal marker panel is an efficient way to genotype non-model organisms that have a well-studied, closely related species.
The transferability problem arises mainly because the majority of genetic variants are rare, so most variants are expected to be specific to individuals, populations, or species. Common variants have been used in marker development, but transferability is still not satisfactory. The transferability problem can be broken down into three levels: 1) Genotyping of the marker fails in a different population or individual. There are many reasons for this. For example, when primers fail to bind to the target sites because of mismatches between the primer and the candidate region, no data are returned. In approaches that involve restriction enzyme digestion or low-depth next-generation sequencing, the missing rate is high because of randomness in digestion or sequencing. In a study that used three SNP chips (BovineSNP50, OvineSNP50, and EquineSNP50) to genotype closely related species that diverged between less than 1 Mya and 50 Mya, the authors found an average 1.5% increase in genotyping failure rate per million years of divergence time \cite{Miller2012}. 2) The marker is not polymorphic in a different population. For example, for the most widely used SNP genotyping array in maize, the Illumina MaizeSNP50 BeadChip, only 17% to 33% of the 49,585 high-quality markers, which were designed mostly for temperate germplasm, are polymorphic in populations derived from European maize inbred lines \cite{Bauer2013}. This problem is more evident in highly diverse and heterozygous species. Transferability drops to 2.3% when markers are transferred between species in the Vitis genus \cite{Vezzulli2008}. The same issue has been reported in cattle \cite{Michelizzi2010,wu2013genome}: only 2% of markers are polymorphic when a panel designed for cattle is applied to water buffalo (diverged ~12 Mya).
Miller et al. examined the retention of polymorphisms across a series of species that share a common ancestor and found that retention follows an exponential decay as divergence time increases. Only 5% of polymorphisms were retained in a species that diverged 5 Mya \cite{Miller2012}. 3) The genetic or physical position of the marker differs, owing to fast linkage disequilibrium decay and large structural variations in the genome. For example, for the same sex-related QTL, the significant markers differ among the four populations studied (Yang 2016).
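As a rough worked illustration of the exponential decay reported by Miller et al. \cite{Miller2012}, the retention of polymorphic markers can be sketched with a one-parameter model. The symbols $R(t)$ (fraction of markers still polymorphic at divergence time $t$, in My) and $\lambda$ (decay rate), as well as the assumption that all markers are polymorphic at zero divergence, $R(0)=1$, are introduced here for illustration only and are not taken from the original study:

\begin{equation}
R(t) = e^{-\lambda t}, \qquad
R(5) = 0.05 \;\Rightarrow\; \lambda = \frac{\ln 20}{5} \approx 0.60\ \mathrm{My}^{-1}, \qquad
t_{1/2} = \frac{\ln 2}{\lambda} \approx 1.16\ \mathrm{My}.
\end{equation}

Under these assumptions, the fraction of retained polymorphisms would halve roughly every 1.2 My of divergence, so a panel designed in a single species would be expected to lose most of its informative markers within a few million years of divergence.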
 
The aims of this study are (1) to develop a pipeline for designing transferable markers using the core genome of the Vitis genus, and (2) to test the transferability of these markers using the high-fidelity rhAmpSeq platform, a new technology that improves on the specificity and throughput of the original AmpSeq. In this study, we de novo assembled six genomes in the Vitis genus, collected three genomes from public databases, and constructed a core genome based on syntenic genome alignment. A total of 2,000 markers spanning all 19 chromosomes, with an average spacing of 200 kbp, were manufactured and tested in four populations that represent the greatest genetic diversity in US breeding practice. The results in the four populations indicate that the core genome combined with the rhAmpSeq platform can generate highly polymorphic data with a very low missing rate. This pipeline provides an interspecies genotyping method that works for highly diverse and heterozygous species.

Materials and methods    

Genomic DNA extraction, library construction and sequencing