2.5.2 Analysis of RNA‐Seq data
The raw RNA‐Seq reads were processed with Trimmomatic
(Bolger et al., 2014) to remove sequences with adapter
contamination, low‐quality nucleotides in excess of 10%, or unknown
nucleotides greater than 50. Processed sequences
shorter than 40 bp were also discarded. Ribosomal RNA
sequences were identified by aligning reads to a
ribosomal RNA database (Quast et al.,
2012) and then removed. The remaining clean reads were mapped to the tomato
reference genome using HISAT, allowing up to two mismatches
(Kim et al., 2015). The expression level
of each gene was determined by counting the number of fragments that
mapped to each gene and then normalizing to the number of fragments per
kilobase of transcript sequence per million base pairs sequenced (FPKM)
using HTSeq software. A gene with an FPKM value ≥ 0.1 was considered
expressed. To identify differentially expressed genes (DEGs), the FPKM
values of each gene from WW and DS anthers were analyzed using DESeq
software (Anders and Huber, 2010). A stringent
cut‐off of |log2 fold change| > 1 and
adjusted p < 0.05 was used to call a gene
significantly differentially expressed. GOseq
(Young et al., 2010) was used to test the DEGs for
functional enrichment of gene ontology (GO) terms.
KEGG pathways significantly enriched in DEGs were identified using
KOBAS software.
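
The normalization and filtering thresholds described above can be illustrated with a minimal Python sketch. It is not the HTSeq or DESeq implementation. The function names are hypothetical, and the FPKM formula assumes the conventional scaling by the total number of mapped fragments in the library, which may differ in detail from the pipeline used here.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Assumption: the library size is the total number of mapped fragments.
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def is_expressed(fpkm_value, threshold=0.1):
    """A gene counts as expressed when FPKM >= 0.1 (threshold from the text)."""
    return fpkm_value >= threshold


def is_deg(log2_fold_change, padj, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Apply the cut-off |log2 fold change| > 1 and adjusted p < 0.05."""
    return abs(log2_fold_change) > lfc_cutoff and padj < padj_cutoff


if __name__ == "__main__":
    # Hypothetical gene: 500 fragments, 2,000 bp long, 20 million mapped fragments
    value = fpkm(500, 2000, 20_000_000)
    print(value, is_expressed(value))
    # Hypothetical test results: (log2 fold change, adjusted p-value)
    print(is_deg(1.5, 0.01), is_deg(-0.8, 0.001))
```

Both inequalities in `is_deg` are strict, matching the stated cut-off, so a gene with a log2 fold change of exactly 1 is not called differentially expressed.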