Nuclear transcript content varies among cell types and genes. (A) Box plots showing median (bars), 25th and 75th percentiles (boxes), and range (whiskers) of percentages of reads mapping to introns for matched nuclei and cell clusters. (B) Box plots of $\log_2$-transformed expression of the nuclear non-coding RNA, Malat1, in matched nuclei and cell clusters. (C) The nuclear fraction of transcripts in cell types was estimated with two methods: the ratio of intronic read percentages in cells to those in nuclei; and the average ratio of expression in cells to that in nuclei for three highly expressed genes (Snhg11, Meg3, and Malat1) that are localized to the nucleus. The relative ranking of nuclear fractions was consistent between methods (Spearman rank correlation = 0.84), although estimates based on the intronic read ratio were consistently 50% higher. (D) Estimated nuclear proportion (ratio of nucleus to soma volume) of neurons labeled by three mouse Cre-lines in Layers 4 and 5 (see Figure \ref{684950}D). Single neuron measurements (grey points) were summarized as violin plots, and average nuclear proportions (black points) were compared to the range of estimated proportions (blue lines) based on intronic read ratios and nuclear gene expression. (E) Histograms of nuclear fraction estimates for 11,932 genes expressed (CPM > 1) in at least one nuclear or cell cluster, grouped by type of gene. (F) Violin plots of marker score distributions with median and inter-quartile intervals. Non-coding genes and pseudogenes are on average better markers of cell types than protein-coding genes. Kruskal–Wallis rank sum test, post hoc Wilcoxon signed rank unpaired tests: *$P < 1 \times 10^{-50}$ (Bonferroni-corrected); NS, not significant. (G) Box plots of cell type marker scores for genes grouped by estimated nuclear enrichment. Nucleus-enriched genes have significantly higher marker scores (linear regression; $P = 2.3 \times 10^{-8}$).
(H) Validation of the estimated nuclear proportion of transcripts for Calb1, Grik1, and Pvalb using multiplex fluorescent in situ hybridization (mFISH). Top: For each gene, transcripts were labeled with fluorescent probes and counted in the nucleus (white) and soma (yellow). Bottom: Probe counts in the nucleus and soma across all cells, with linear regression fits to estimate nuclear transcript proportions for each gene. Estimated proportions based on mFISH and RNA-seq data are summarized on the right.
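
The two nuclear fraction estimates described in panel (C) can be illustrated with a minimal sketch. The function names and all numeric values below are hypothetical and chosen only for illustration; they are not taken from the data. The sketch also assumes that expression values are on a linear scale (for example CPM), not log-transformed.

```python
from statistics import mean


def nuclear_fraction_from_introns(cell_intron_pct, nucleus_intron_pct):
    """Estimate the nuclear fraction as the ratio of the intronic read
    percentage in cells to the intronic read percentage in nuclei."""
    return cell_intron_pct / nucleus_intron_pct


def nuclear_fraction_from_genes(cell_expr, nucleus_expr):
    """Estimate the nuclear fraction as the average cell-to-nucleus
    expression ratio across nucleus-localized genes.

    Both arguments map gene name -> expression on a linear scale
    (assumption: CPM or similar, not log2-transformed)."""
    ratios = [cell_expr[gene] / nucleus_expr[gene] for gene in nucleus_expr]
    return mean(ratios)


# Hypothetical values for one matched cell/nucleus cluster pair.
intron_estimate = nuclear_fraction_from_introns(
    cell_intron_pct=20.0, nucleus_intron_pct=50.0
)

cell_expr = {"Snhg11": 30.0, "Meg3": 40.0, "Malat1": 50.0}
nucleus_expr = {"Snhg11": 100.0, "Meg3": 100.0, "Malat1": 100.0}
gene_estimate = nuclear_fraction_from_genes(cell_expr, nucleus_expr)

print(f"intron-based estimate: {intron_estimate:.2f}")
print(f"gene-based estimate:   {gene_estimate:.2f}")
```

Both functions return a unitless ratio per cluster pair, so the estimates from the two methods can be ranked and compared across cell types, as in the Spearman rank correlation reported for panel (C).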