Identifying Anachronisms in Your Fiction with Google's Ngram Viewer

photo of hourglass

Photo by Aron Visuals on Unsplash

Anachronistic language in historical fiction

An anachronism is:

"an error in chronology, especially : a chronological misplacing of persons, events, objects, or customs in regard to each other" [1]

This can refer to the past intruding on the present, or the present on the past. For writers of historical fiction who care about historical accuracy, the most important type of anachronism is the unintentional inclusion of modern objects or language in stories set in the past. In this blog post, I'll concentrate on anachronistic language: words and phrases that were not yet in common usage at the time a novel is set.

Many anachronisms are obvious and easy to avoid. You won't have your 19th-century housemaid twerking, dabbing, or googling the address of the jewellers where the Duchess of Devonshire left her emeralds to be cleaned.

However, others are more subtle. Take the word shellshocked, for instance, as in: "As Alice ran from the burning building, she had to force her way through a crowd of shellshocked onlookers". Did you know this word is related to the artillery shells used in WW1? It was coined in 1915. Although it's likely that few or none of your readers will know the exact date, many will be aware of the word's association with the First World War, and won't appreciate its use in your medieval romance or ancient Roman military fiction. Similarly, your 19th-century characters should not be wearing trenchcoats, as this term originated in the trenches in WW1. Dogfight and camouflage are other examples of words coined in the 1910s. Some other expressions, like no man's land, did exist pre-WW1, but were popularised during the war, as many readers will know.

How can you determine when a given word or phrase first came into common usage? This question belongs to the field of etymology, "the study of the origin of words and the way in which their meanings have changed throughout history"[2].

An etymology dictionary is what you need, but unfortunately the best dictionaries, such as the OED, are behind a paywall. If you don't have access to one, searching for "your phrase" + "etymology" is usually a good place to start.

If you're looking for a free alternative, one very useful tool is Google's Ngram Viewer, which shows when a word first came into popular usage or fell out of favor.

Using the Ngram Viewer to check for historical accuracy in your fiction

Let's take a look at some examples of the Ngram Viewer in action.

plot of usage over time from ngram viewer

The term "black market" originated during wartime rationing in WW2 and would be completely out of place in any story set prior to the war.

plot of usage over time from ngram viewer

"Hectic fever" was a 19th-century medical diagnosis associated with tuberculosis.

plot of usage over time from ngram viewer

Weapons designed to cause fires have existed for millennia, one famous example being the Byzantine Empire's "Greek fire". However, such weapons have only been referred to as "incendiary devices" since the Second World War.

plot of usage over time from ngram viewer

"Old chap" is a form of address used in British English. It has been steadily declining in popularity since the 1920s.

plot of usage over time from ngram viewer

At a "penny wedding", the guests each paid a penny towards the costs. The expression describing this 19th-century cultural practice has fallen into disuse.

plot of usage over time from ngram viewer

"Persiflage" is light, mocking banter. This word of French origin entered the English language in the 18th century, but later declined in popularity, and is unlikely to be part of the vocabulary of a late 20th-century character.

plot of usage over time from ngram viewer

A "sniper" is someone who shoots with high accuracy from a place of concealment. The word is first attested in the 1820s, coming from British India, where the species of bird known as the "snipe" was considered particularly difficult game to hunt. Despite this relatively early date of first usage, the term did not become widespread until the First World War. Prior to that, marksmen were much more commonly referred to as "sharpshooters". As this example highlights, dates of first attestation in etymological dictionaries are not always reliable guides to actual usage.

plot of usage over time from ngram viewer

"Macadam" is a road covering made of layers of crushed, compacted stone, invented by John McAdam around 1820. "Tarmacadam" or "tarmac" is an improvement on McAdam's method that binds the stone with tar, patented by Edgar Purnell Hooley in 1902. Unless your characters are engineers building a road, you're unlikely to be interested in the exact technical differences between the two techniques. However, you'll want the drive to your turn-of-the-century English country house or Texan mansion to be made of macadam, not tarmac. Note also that in both cases, the terms' coinage pre-dates by several decades their entry into widespread usage.

plot of usage over time from ngram viewer

"Trenchcoats" originated in the trenches of WW1. A 19th-century man might have worn a "greatcoat" instead.

Side note: In case you're wondering where the Ngram Viewer's strange name comes from, an ngram or n-gram (usually spelled with a hyphen) is simply a sequence of n consecutive words. For example, "stranger things", "and he", and "going down" are all 2-grams, "what do you" is a 3-gram, and so on.
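For readers who like to see an idea spelled out, here is a minimal Python sketch of what "splitting a text into n-grams" means. The function name ngrams is my own choice for illustration, and splitting on whitespace is a simplifying assumption; it is not a description of how Google actually processes its corpus.

```python
def ngrams(text: str, n: int) -> list[str]:
    """Return every run of n consecutive words in text, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Assumption: words are separated by whitespace; punctuation is left attached.
    words = text.split()
    # Slide a window of n words across the text, one word at a time.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


if __name__ == "__main__":
    print(ngrams("what do you want", 2))
    print(ngrams("what do you want", 3))
```

A text shorter than n words simply produces an empty list, so "and he" contains one 2-gram and no 3-grams.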

Limitations of the Ngram Viewer

Although the Ngram Viewer is an extremely useful tool, it has several limitations you should be aware of.

Google's results are based on an analysis of written texts only. Words that people often use in speech but not in written English will be under-represented. This also means that the tool only reflects the language usage of the literate classes of society, and not the less well educated.

Note also that the underlying data is not completely reliable. There are errors, notably from the difficulty Google's Optical Character Recognition (OCR) technology has with the "long s" (ſ) found in many texts printed before about 1800, which it often misreads as the letter f. In general, words containing the letters s or f are the most likely to display inaccurate results.
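If you suspect that a word's early results are affected by this S/F confusion, it can help to search for the misread spelling as well. The Python sketch below guesses that spelling using a deliberately simplified rule: every lowercase s except one at the very end of the word is treated as if it had been read as f. The function name long_s_ocr_variant and the rule itself are assumptions made for illustration; real printing conventions varied, and this is not how Google's software works.

```python
def long_s_ocr_variant(word: str) -> str:
    """Guess how a word might appear if each long s was misread as f.

    Simplified assumption: the long s appears wherever a lowercase s is
    not the final letter of the word. Capital S is left unchanged.
    """
    if not word:
        return word
    body, last = word[:-1], word[-1]
    # Replace every non-final lowercase s; keep the final letter as it is.
    return body.replace("s", "f") + last


if __name__ == "__main__":
    for w in ["best", "case", "sniper", "possess"]:
        print(w, "->", long_s_ocr_variant(w))
```

Comparing the plots for both spellings, for example "best" and "beft", gives a rough idea of how much early usage may have been filed under the wrong word.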

Another problem arises from how Google determines publication dates. For example, a 1990 reprint of Pride and Prejudice may be mis-classed as a text written in 1990.

These minor problems don't stop the Ngram Viewer being one of the best resources out there for writers of historical fiction who care about historical accuracy and avoiding anachronistic language.
