7. Future Works
The size and complexity of the dataset presented the biggest difficulty in creating the model. Although it is an expensive process, expanding the dataset will improve model performance. Since the time of postings was the sole factor taken into account in addition to users’ posts, it stands to reason that more information about users would be helpful to academics as well. The work of Benton et al. [14] on gender representation as an additional attribute may be useful. Alternatively, the authors could use it directly to Long Short-Term Memory (LSTM) networks, which are the main component of most cutting-edge models because they can start making use of long sequential input data as well as its going to order by wanting to avoid gradient vanishing and using recurrent connections. The authors could try using word embeddings currently averaging across all sayings as input towards the logistic regression model.