How to Use the Google Ngram Viewer: Overdose


Google has an interesting tool to play with: the Google Ngram Viewer. It plots how often a word or phrase appears in print over time, drawing on what is termed “a corpus of books,” and gives you an idea of how words are trending in the popular culture. More precisely, for each year it graphs the word or phrase as a percentage of all the words (or phrases of the same length) in the texts published that year, which makes it possible to compare the use of a particular word in published books across time. The corpus is composed of the large number of books that Google has scanned in from public libraries.
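To make the idea of "percentage per year" concrete, here is a minimal sketch in Python. The function name and the counts are made up for illustration; they are not real Ngram data, and this is not Google's own code, only the arithmetic the description above implies.

```python
def yearly_percentages(word_counts, total_counts):
    """Return {year: percent of all words that year that are the target word}.

    word_counts  -- {year: occurrences of the target word}   (hypothetical data)
    total_counts -- {year: total words printed that year}    (hypothetical data)
    """
    result = {}
    for year, total in total_counts.items():
        if total == 0:
            # No text for this year, so no percentage can be computed.
            continue
        result[year] = 100.0 * word_counts.get(year, 0) / total
    return result


# Invented counts, purely for illustration.
word_counts = {1960: 5, 1970: 20}
total_counts = {1960: 1_000_000, 1970: 1_000_000, 1980: 0}

percentages = yearly_percentages(word_counts, total_counts)
for year in sorted(percentages):
    print(year, f"{percentages[year]:.4f}%")
```

Because each year is divided by its own total, a year with many more books scanned does not automatically look like a year in which the word was more popular.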

I thought it would be interesting to look at a few words from the substance use and mental health fields that are currently in the news. So let’s start with overdose in this post, since it has gotten quite a lot of attention lately. I used the case-sensitive and English-only options for the word overdose. In Figure 1 I chose a smoothing factor of 3, as recommended, to make the graph more readable. To get a broad overview, the first graph spans the period from 1500 to 2008.
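As I understand it, a smoothing factor of 3 means that each year's value is averaged with the values from the three years on either side. The sketch below shows that kind of moving average; the function name, the handling of the first and last years (simply averaging whatever neighbours exist), and the numbers are my own assumptions for illustration, not the Viewer's actual code.

```python
def smooth(values, window):
    """Centered moving average: each point becomes the mean of itself
    and up to `window` neighbours on each side. A window of 0 returns
    the raw values unchanged."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - window)                # clip at the start of the series
        hi = min(len(values), i + window + 1)  # clip at the end of the series
        neighbours = values[lo:hi]
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed


# Invented series: a single-year spike surrounded by zeros.
raw = [0.0, 0.0, 7.0, 0.0, 0.0, 0.0, 0.0]

unsmoothed = smooth(raw, 0)  # same as raw
smoothed = smooth(raw, 3)    # the spike is spread over neighbouring years

print(unsmoothed)
print(smoothed)
```

With smoothing, a one-year spike is spread across the surrounding years, so the graph is easier to read but it becomes harder to tell exactly which year the mentions came from.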

From the Figure 1 graph we can see that the word overdose does not appear in print before the 1720s, which suggests the term may not have entered the lexicon until then. To investigate the plateaus in the Figure 1 data, I reduced the smoothing factor to 0 in Figure 2 to determine during which years the term ‘overdose’ was used more often.



Figure 1

If you move your cursor along the x-axis in Figure 2, you can see that the first mentions of the word overdose fall in 1724 and 1725. However, a Google Books search for those two years turns up references in which 1724 and 1725 are not years at all, but numbered laws in the California legal code and page numbers in journals and nursing texts. This illustrates why it is important to drill down and actually look at the published books of the time period in Google Books. It appears that before 1812 the word overdose referred to elements applied in excess for agricultural purposes. In 1812 we find the first mention in Google Books of overdose in conjunction with a drug, opium.



Finally, in Figure 3, the Ngram graph shows mentions of the word overdose increasing very gradually until the mid-1960s, at which time the percentage of mentions increases by about a factor of 4. The trend has been steeply upward since then, with the exception of a downward turn from 2001 to 2008. Unfortunately, data is not available after 2008.



These trends reflect how frequently the phenomenon of overdose has been written about across the totality of Google Books. Creating and viewing these Ngrams gives us information that, while limited in validity, can stimulate our thinking about how the written language reflects the issues of various time periods. Not exacting, but interesting nonetheless.

You can learn more about the Ngram Viewer here.