\cite{Barbará2007}. However, the transferability of high-throughput genetic markers can be as low as 2\% when markers are transferred across species \cite{Vezzulli2008,Chagné2012}. In this study, 93\% of markers returned data in all four populations we tested, and approximately 82\% of markers were polymorphic in all four populations. The genetic maps built for these four populations were consistent with one another, indicating that marker order and segregation patterns are conserved. Although 10\% to 20\% of markers in each population deviated from the expected Mendelian segregation ratio, these distorted markers were population specific. We compared the distances between distorted markers with the distances between an equal number of markers randomly sampled from the entire set; the distances between distorted markers were significantly smaller than the random expectation (Mann-Whitney test, $p < 10^{-13}$). This reduced distance indicates that the distorted markers are clustered on the chromosomes and are therefore likely linked. In addition, when we combined individuals from the four populations into a metapopulation, only XX\% of markers were distorted. This further suggests that the majority of markers are informative for constructing a genetic map, and that a small portion may fail because of differences in genetic background. We also found that, in the two populations in which sex loci were located, the markers explaining the largest proportion of phenotypic variation were the same, which indicates that functional markers, and not only random markers, are transferable. In summary, we validated that markers designed from the genus-wide core genome and genome polymorphism are transferable at four levels: amplification, polymorphism, segregation, and marker-trait association.
The key to developing transferable markers is the construction of a genus-wide core genome that takes collinearity into account. Previously, markers designed by resequencing a large number of samples had limited transferability. DNA resequencing provides rich information on small genetic variants, but large and complex structural variation is often missed. The long collinear blocks conserved within the genus suggest strong selection against structural variation in these regions, which increases the probability of identifying markers that occur consistently in the genome and show consistent segregation patterns. Our results indicate that, to identify markers transferable across species, a core genome defined with collinearity should be considered.
Advantages of the rhAmpSeq genotyping platform for highly diverse and heterozygous species
In our previous study, we found that the AmpSeq genotyping platform outperforms GBS and other NGS-based genotyping platforms for highly diverse and heterozygous species, owing to limited missing data, increased coverage and accuracy at heterozygous sites, and elevated transferability among distinct species. Unlike SNP arrays or KASP, which target one specific polymorphic site, AmpSeq also allows the identification of novel alleles and short haplotype blocks, because the entire amplified region (90 to 250 bp long) is sequenced by NGS. Therefore, for a pair of individuals that differ at more than 1 SNP per 250 bp, a 250 bp amplified region is expected to contain at least one SNP that distinguishes the two individuals, which increases the information content of the markers and makes the platform suitable for a broad range of germplasm. The high coverage and unbiased sequencing of amplicons also make this platform applicable to population genetics and ecology studies.
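As a rough check on the amplicon-length argument above, the expected number of distinguishing SNPs can be written out explicitly. The symbols $d$ (pairwise divergence per bp), $L$ (amplicon length in bp), and $k$ (number of SNPs distinguishing the two individuals within the amplicon) are introduced here only for illustration, and the assumption that SNPs occur independently and uniformly along the amplicon (a Poisson approximation) is a simplification, not a result of this study.
\[
E[k] = dL, \qquad P(k \ge 1) = 1 - e^{-dL}.
\]
At the threshold $d = 1/250$ and $L = 250$ bp, $E[k] = 1$ and $P(k \ge 1) = 1 - e^{-1} \approx 0.63$. Under these assumptions the statement holds in expectation rather than for every amplicon, and the probability rises as divergence increases.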
Compared with AmpSeq, rhAmpSeq introduces an extra step that reduces mismatches between the primer and the template DNA and reduces primer-dimer formation.
% TODO: ask IDT to add details.
Genus-wide pan-genome
With the explosion of genomic data in the past decade, it has become clear that a single genome is insufficient to capture the genetic variation of a species. Multiple genomes provide a comprehensive understanding of complex structural variation, lineage-specific genes, and large-effect variants within a species. In this study, we constructed a pan-genome for the whole Vitis genus, which diverged approximately 28 million years ago. Unlike pan-genomes constructed for one species or for closely related species, we expect the core genome of the Vitis genus to be smaller than that of a single species because of the greater genetic distance among the samples we collected. For example, in soybean, 80\% of genes are present in all seven accessions \cite{Li2014}. In rice, 61\% of genes are present in 90\% of a total of 62 accessions \cite{Zhao2018}. In maize, only 50\% of the genome has been reported to lie in syntenic blocks between B73 and Mo17 \cite{Sun2018}. Between the most distantly related species in the Vitis genus that we examined (V. rupestris and V. vinifera), about 29\% of the genome is collinear with the reference genome PN40024. However, we also found that the core genome of the Vitis genus is enriched in genic regions and overlaps about 70\% of the genes in the PN40024 genome. Although we tested only 200 markers spanning 1372 genes, we expect that transferable markers can be developed for the majority of genes in the Vitis genus.