In silico analysis of gene expression and tissue-specific basal exon skipping
Because these results suggest that truncating variants are associated with a more severe clinical picture, we assessed whether exon skipping therapies could be applied to rescue truncating variants (Ramsbottom et al., 2018). Basal exon skipping events are particularly informative because they indicate that skipping of the exons concerned is likely well tolerated.
CEP120 and CC2D2A are ubiquitously expressed in human tissues, with highest expression levels in the female reproductive system and cerebellum for the former and in smooth muscle and the female reproductive system for the latter (Figure S3). Expression of both genes has also been reported in the human kidney (Figure S3 and http://www.proteinatlas.org; Uhlén et al., 2015). Cerebellum and kidney phenotypes are classically encountered in ciliopathies (Bachmann-Gagescu et al., 2012; Vilboux et al., 2017). Using human RNA sequencing data available through the Genotype-Tissue Expression (GTEx) project (https://www.gtexportal.org/home/), we investigated the tissue-specific expression and splicing of CEP120 and CC2D2A in kidney and cerebellum. ENST00000306481 (“transcript 1”) is the main CEP120 transcript detected in the kidney medulla and the cerebellar hemisphere. Abundant expression of ENST00000328236 (“transcript 2”) and ENST00000306467 (“transcript 3”) was also detected in the cerebellum, but both were nearly absent in the kidney (Figure 2Ai). These transcript isoforms are generated through alternative splicing events at the pre-mRNA 5’-end, with exon 2 (ref.: ENST00000328236 or NM_153223) predicted to be skipped in the kidney (Figure 2Aii). These changes are reflected in the predicted protein product of transcript 1, which lacks the first 26 amino acids (Figure 2Aiii).
For CC2D2A, the main protein-coding transcripts in the kidney medulla are ENST00000515124 (“transcript 1”) and ENST00000503292 (“transcript 2”), and those in the cerebellar hemisphere are transcript 2 and ENST00000389652 (“transcript 3”) (Figure 2Bi). Transcript 1 is short (1474 bp), lacks functional CC2D2A domains and is generated by splicing in an additional exon after exon 5 (ref.: ENST00000503292), leading to a premature stop codon. This transcript is supported by the detection in GTEx of the specific junction in nearly all tissues, with enrichment in the kidney (Figure 2Bii). Transcript 3 is detected in the cerebellum but not the kidney and has an incomplete open reading frame, with the 5’-end not fully annotated. However, based on GTEx junction expression data, exon 2 appears to be spliced in the kidney but not the cerebellum, while an exon predicted in the cerebellum (between exons 30 and 31) is skipped in the kidney. Of interest, a splice junction leading to skipping of exon 30 is detected at low frequency and almost exclusively in the kidney medulla (Figure 2Bii). At the protein level, transcript 1 encodes a 111-amino-acid product that shares its first 82 amino acids with canonical transcript 2 (Figure 2Biii). In summary, human RNAseq data suggest the presence of tissue-specific transcripts for CEP120 and CC2D2A. Exons that are predicted to undergo organ-specific splicing events, such as exon 30 of CC2D2A, represent optimal candidates for exon skipping strategies. Isoform expression predicted from RNA sequencing data must, however, be interpreted with caution, and specific isoforms should be confirmed by dedicated RT-PCR (Molinari et al., 2018).
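As an illustration of how junction-level read counts can be turned into a per-tissue estimate of basal exon skipping, the sketch below computes the fraction of junction reads supporting skipping of a single exon. It is a minimal example under stated assumptions, not the analysis performed here and not a GTEx interface: the function names, the tissue labels and all read counts are hypothetical, and the averaging of the two inclusion junctions is one common convention among several.

```python
from typing import Dict, Optional


def exon_skipping_fraction(upstream_inclusion: int,
                           downstream_inclusion: int,
                           skipping: int) -> Optional[float]:
    """Fraction of junction reads that support skipping of one exon.

    upstream_inclusion / downstream_inclusion: reads spanning the two
    junctions that join the exon to its neighbours (exon included).
    skipping: reads spanning the junction that joins the two neighbours
    directly (exon skipped).
    Returns None when no junction reads are available.
    """
    # An included exon produces two junctions, so average them to keep
    # inclusion and skipping evidence on a comparable scale.
    inclusion = (upstream_inclusion + downstream_inclusion) / 2
    total = inclusion + skipping
    if total == 0:
        return None
    return skipping / total


def skipping_by_tissue(counts: Dict[str, Dict[str, int]]) -> Dict[str, Optional[float]]:
    """Apply exon_skipping_fraction to each tissue in a counts table."""
    return {
        tissue: exon_skipping_fraction(c["up"], c["down"], c["skip"])
        for tissue, c in counts.items()
    }


# Hypothetical junction read counts for a single exon (illustrative only,
# not taken from GTEx or from the figures of this study).
example_counts = {
    "kidney_medulla": {"up": 40, "down": 44, "skip": 6},
    "cerebellar_hemisphere": {"up": 50, "down": 50, "skip": 0},
}

if __name__ == "__main__":
    for tissue, fraction in skipping_by_tissue(example_counts).items():
        print(tissue, fraction)
```

A low but non-zero fraction confined to one tissue would correspond to the pattern described for exon 30 of CC2D2A, although any such estimate from short-read data would still require confirmation by RT-PCR.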