Using the Natural Language Toolkit (NLTK), a Python library, we can begin to analyze the headlines and examine the types of words these authors use. Before parsing the headlines, we first need to prepare the text. The preparatory steps include downloading NLTK and its relevant dependencies; splitting each headline into words on whitespace; tokenizing to separate words from punctuation; and removing stop words, which are frequently occurring words from a predefined list, so that they do not skew the counts of the most popular words in the headlines.
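The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code behind the analysis: the function names `load_stop_words` and `count_headline_words` and the sample headlines are hypothetical, lowercasing is an added assumption, and NLTK's regex-based `wordpunct_tokenize` stands in for whichever tokenizer was actually used because it needs no extra data download.

```python
from collections import Counter

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import wordpunct_tokenize  # regex-based, no data download needed


def load_stop_words():
    """Return NLTK's English stop-word list, downloading it on first use."""
    try:
        return set(stopwords.words("english"))
    except LookupError:
        nltk.download("stopwords", quiet=True)
        return set(stopwords.words("english"))


def count_headline_words(headlines, stop_words):
    """Count words across headlines, skipping punctuation and stop words."""
    counts = Counter()
    for headline in headlines:
        # Lowercasing is an assumption so "Bitcoin" and "bitcoin" count together.
        tokens = wordpunct_tokenize(headline.lower())
        # Keep alphabetic tokens only, which drops punctuation and numbers.
        words = [t for t in tokens if t.isalpha() and t not in stop_words]
        counts.update(words)
    return counts


if __name__ == "__main__":
    # Hypothetical headlines, used only to demonstrate the call.
    sample = [
        "Bitcoin ransom demanded, police say",
        "Is bitcoin the future?",
    ]
    counts = count_headline_words(sample, load_stop_words())
    print(counts.most_common(4))
```

Passing the stop-word set in as an argument keeps the counting step independent of the download, so a custom list can be swapped in if the default English list removes too much or too little.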
The results show that the term “bitcoin” dominates the top four words in the headlines, with “ransom” as the runner-up. For a closer look, I pulled up the headlines of the articles containing the word “ransom”. These articles were published in August 2017, May 2017, and February 2016, and all cover stories about the misuse of bitcoin. For example, the most recent article concerned a criminal demanding payment in bitcoin for the return of a hostage, while the other articles related to compromises due to hacking.
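Looking up the headlines that contain a given word can be sketched as a simple filter. The data layout here, a list of `(published, headline)` pairs, the function name `headlines_containing`, and the sample rows are all assumptions made for illustration; the real dataset may be organized differently.

```python
import re


def headlines_containing(articles, keyword):
    """Return the (published, headline) pairs whose headline contains keyword.

    The match is case-insensitive and limited to whole words, so searching
    for "ransom" does not also return "ransomware".
    """
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [
        (published, headline)
        for published, headline in articles
        if pattern.search(headline)
    ]


if __name__ == "__main__":
    # Hypothetical rows, not the headlines from the analysis.
    articles = [
        ("2017-08", "Kidnappers demand bitcoin ransom"),
        ("2017-01", "Ransomware on the rise"),
        ("2016-02", "Bitcoin price climbs"),
    ]
    for published, headline in headlines_containing(articles, "ransom"):
        print(published, headline)
```

Whole-word matching is a design choice: a plain substring test would be simpler, but it would mix related terms into the results.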