Google’s BERT transformer is an exciting and powerful tool: it has been used to greatly enhance the quality of Google’s search results, and it is open source, which means I can train my own models with the BERT architecture.
BERT lets you predict a word or series of words that belongs in a sentence based on the surrounding context. A trained BERT model could fill in the blank in a sentence like: ________ like to bark and are man’s best friend.
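As a minimal sketch of what that looks like in practice (this is not from my notebook, and the model name is just an assumption), the Hugging Face transformers library exposes a fill-mask pipeline that does exactly this kind of prediction. BERT masks a single token at a time:

```python
# Minimal sketch of BERT's masked-word prediction using Hugging Face transformers.
# "bert-base-uncased" is an illustrative choice, not the model used in my notebook.
from transformers import pipeline

# Load a pretrained BERT model with a masked-language-modeling head.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT predicts the token hidden behind [MASK] from the surrounding context.
for prediction in fill_mask("[MASK] like to bark and are man's best friend."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

The top prediction for the masked token should be something like “dogs,” which is the whole idea: the model infers the missing word purely from context.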
If you could train a transformer on a Reddit thread or subreddit, you could predict how that community would answer a new, unique question.
You can see my attempts here:
https://github.com/antonecg/Deep-Learning/blob/main/SmallThreadBert.ipynb
Initially, I used full-length posts but ran into time constraints: training would have taken weeks. I then built a model that ran on only a subset of posts, but the results so far have not been satisfactory. The dataset I was using, my harvested Reddit data, was far too small to produce an accurate BERT model.
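In rough outline, the subset approach looks like the sketch below: fine-tune BERT’s masked-language-modeling head on a small sample of the harvested posts. The file name, sample size, and training settings here are illustrative assumptions, not the notebook’s exact values.

```python
# Rough sketch of fine-tuning BERT's masked-LM head on a small subset of Reddit posts.
# File name, sample size, and hyperparameters are illustrative assumptions.
import random
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Load harvested posts (hypothetical one-post-per-line file) and keep only a
# subset to stay within time constraints.
with open("reddit_posts.txt") as f:
    posts = [line.strip() for line in f if line.strip()]
subset = random.sample(posts, k=min(2000, len(posts)))

# Tokenize the subset; the collator randomly masks 15% of tokens in each batch,
# which is the standard masked-language-modeling objective.
encodings = tokenizer(subset, truncation=True, padding=True, max_length=128)
dataset = [{"input_ids": ids, "attention_mask": mask}
           for ids, mask in zip(encodings["input_ids"], encodings["attention_mask"])]
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="small-thread-bert",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```

Because the collator re-masks tokens on every batch, each pass over the subset gives the model slightly different training examples, but that only stretches a small corpus so far; it cannot substitute for more data.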
Fortunately, there is cutting-edge research (from just the last few weeks) into transformer architectures that can be trained on smaller datasets. I am very excited about DeepMind’s Retrieval Transformer (RETRO) architecture and hope to start my implementation in the coming weeks.
https://deepmind.com/research/publications/2021/improving-language-models-by-retrieving-from-trillions-of-tokens