Polygenic Score

We construct a polygenic score to predict the educational attainment of married men and women using data from Add-Health, building on the findings of a recent large-scale genome-wide association study (GWAS) of educational attainment \cite{Okbay_2016}. Rather than focusing on a limited number of genetic variants, polygenic scores (PGSs) use information from across the whole genome (or a large proportion of it) to construct a measure of genetic predisposition to higher educational attainment \cite{Plomin_2010,Domingue_2014,Conley_2015}.
Recent advances in molecular genetics have made it possible and relatively inexpensive to measure millions of genetic variants in a single study. The most common type of genetic variation among people is the single nucleotide polymorphism (SNP). Each SNP represents a difference in a single DNA building block, called a nucleotide, and SNPs occur throughout a person's DNA. SNPs are genetic markers that have two variants, called alleles. Since individuals inherit two copies of each SNP, one from each parent, there are three possible outcomes: 0, 1 or 2 copies of a specific allele.
We generate a polygenic score based on the most recent available GWAS results on educational attainment \cite{Okbay_2016}. The same polygenic score is used in the analysis of years of education and of college attainment, since the genetic correlation between the two measures is very high, with the point estimate suggesting a perfect genetic correlation.
Using the summary statistics publicly available from the Social Science Genetic Association Consortium (http://www.thessgac.org), we construct a linear polygenic score in which each SNP is weighted by its effect size in the meta-analysis. The score is constructed using the software packages PLINK and PRSice \cite{Purcell_2007,Euesden_2014}. We use the complete set of available SNPs ($p$-value $< 1$); the SNPs are clumped using the genotypic data as a reference panel for linkage disequilibrium structure. Finally, we standardise the score to have mean zero and standard deviation one. Figure \ref{dist} shows the distribution of the unstandardised PGS for educational attainment calculated for 9,926 individuals in Add-Health.
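
As a sketch of this construction, in notation that is introduced here for illustration and is not taken from the cited software, let $x_{ij} \in \{0,1,2\}$ denote the number of copies of the effect allele carried by individual $i$ at SNP $j$, let $\hat{\beta}_j$ denote the corresponding GWAS effect-size estimate, and let $S$ denote the set of SNPs retained after clumping. A linear polygenic score then takes the form
\begin{equation}
\mathrm{PGS}_i = \sum_{j \in S} \hat{\beta}_j \, x_{ij},
\end{equation}
and the standardised score is
\begin{equation}
\widetilde{\mathrm{PGS}}_i = \frac{\mathrm{PGS}_i - \overline{\mathrm{PGS}}}{s_{\mathrm{PGS}}},
\end{equation}
where $\overline{\mathrm{PGS}}$ and $s_{\mathrm{PGS}}$ are the sample mean and standard deviation of the score across individuals. The scaling applied internally by PLINK and PRSice may differ from the first expression, for instance by averaging over the number of non-missing alleles.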