Do you know how many words were coined by Shakespeare in the English language? Let me give you a hint: a lot, including the word ‘hint’! Some of these words are very common and frequently used, such as the adjective ‘obscene’ or the verb ‘to film’. Others seem to have been less successful at polluting the language, such as ‘deracinate’ (whose root is the French word racine, that is: root), which means to tear up by the roots or to remove from one’s homeland.
The number of words Shakespeare coined is amazingly large: at least 1000. I made a compilation of many of them (560, available below), taking them from many sources such as this, this, this, this, this, this, and this. Some say that these words were invented, others use the more conservative word ‘coined’, and finally some are more accurate and say that most of these words are loanwords, because many of them were ‘borrowed’ from French or Latin. I read some passionate forum discussions about the use of the word ‘invention’ for these cases. I kept it in the title (it is more provocative) because I think it is arguable that there is no such thing as a deracinated invention, something that came out of the blue without context, inspiration or reference.
This is the list of the 20 most common words, ranked by their number of Google results:
advertising, 338000000
control, 227000000
design, 188000000
published, 169000000
road, 167000000
manager, 143000000
overview, 125000000
label, 87500000
useful, 83200000
cheap, 81600000
employment, 74800000
traditional, 74700000
secure, 73600000
lower, 70200000
switch, 58500000
successful, 52400000
investment, 47000000
import, 41800000
critical, 41100000
gossip, 35000000
Once I knew about the existence of these words, many of them now common currency, my first question was whether people whose mother tongue is English (not my case) are aware of this and, if so, for which words. I want to coin here a new use for the adjective ‘deracinated’: I will say that a word is deracinated if no one knows what its roots are. I also propose a gradation in its use: a word is more or less deracinated according to the number of people using it who know its roots, its etymology, its origin, its history.
I figured out a method that gives me information about how deracinated a word is. The method is based on the multi-search technique, previously introduced and discussed in this blog. The idea is to measure the number of times a word is used, and also the number of times it is used in a context in which the name of the writer is mentioned. Imagine a word that, every time it is used, appears in a text where the word Shakespeare is also present: we can conclude that this word is not deracinated at all. In the opposite case, a commonly used word that is never accompanied by a mention of Shakespeare seems to have a life of its own, far from its inventor, and very few people remember its origin. Using the Google search engine I ran two searches for each word: the word alone, and the word followed by AND Shakespeare. Then I compared both results.
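To make the two-searches-per-word idea concrete, here is a minimal Python sketch. It is not the code of the Impure space: the function name `paired_counts` and the stand-in `fake_count_results` are hypothetical, and the stand-in simply returns counts copied from the tables in this post instead of querying any real search engine.

```python
from typing import Callable, Dict, Iterable, Tuple


def paired_counts(
    words: Iterable[str],
    anchor: str,
    count_results: Callable[[str], int],
) -> Dict[str, Tuple[int, int]]:
    """For each word, run two queries: the word alone and the word AND anchor."""
    counts = {}
    for word in words:
        total = count_results(word)
        with_anchor = count_results(f"{word} AND {anchor}")
        counts[word] = (total, with_anchor)
    return counts


# Hypothetical stand-in for a search engine: counts copied from this post.
SAMPLE_COUNTS = {
    "eyewink": 8070,
    "eyewink AND Shakespeare": 23,
    "scroyle": 624,
    "scroyle AND Shakespeare": 311,
}


def fake_count_results(query: str) -> int:
    """Return the stored count for a query, or 0 if the query is unknown."""
    return SAMPLE_COUNTS.get(query, 0)


result = paired_counts(["eyewink", "scroyle"], "Shakespeare", fake_count_results)
for word, (total, with_anchor) in result.items():
    print(word, total, with_anchor)
```

Passing the counting function as a parameter keeps the sketch independent of any particular search service; a real one could be plugged in where `fake_count_results` is used.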
The following image depicts the 590 words on a scatter plot.

Words in the upper left are the ones that have a larger proportion of occurrences in webpages mentioning Shakespeare. Words in the lower right have a lower proportion of occurrences in websites that mention the writer: they are the most deracinated. Among these words are priceless, to film, hint, obscene, and to undress.
It’s clear that, taking base-10 logarithms, the points are distributed linearly. The deracination index is depicted by a temperature color scale (from black for the least deracinated to white for the most). The final index is defined by this formula, where G(query) is the number of Google results for the query:
log10[ G(text) + 1 ] / log10[ G(text + AND shakespeare) + 1 ]
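As a quick check of the formula above, here is a small Python sketch that computes the index from the two counts. The function name `deracination_index` is my own choice, and returning infinity when there are no results mentioning Shakespeare is an assumption: the formula itself is undefined in that case, because the denominator becomes log10(1) = 0. The sample rows are copied from the tables in this post.

```python
import math


def deracination_index(total: int, with_shakespeare: int) -> float:
    """log10(total + 1) / log10(with_shakespeare + 1), as in the formula above."""
    numerator = math.log10(total + 1)
    denominator = math.log10(with_shakespeare + 1)
    if denominator == 0:
        # Assumption: no co-occurrence at all is treated as fully deracinated.
        return math.inf
    return numerator / denominator


# (word, Google results, Google results with AND Shakespeare), from this post.
rows = [
    ("to unsex", 1020, 482),
    ("to bemad", 169, 2),
    ("to outfrown", 5, 5),
    ("eyewink", 8070, 23),
]

# Sort from the most deracinated word to the least.
ranked = sorted(rows, key=lambda r: deracination_index(r[1], r[2]), reverse=True)
for word, total, with_shakespeare in ranked:
    print(f"{word}: {deracination_index(total, with_shakespeare):.2f}")
```

When both counts are equal the index is exactly 1, which is the lowest value a word can get as long as the second search never returns more results than the first.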
This is the list of the 20 most deracinated words coined by Shakespeare, with their Google search results, the Google search results when adding AND Shakespeare, and their deracination index values:
to bemad, 169, 2, 4.67
to over-red, 132, 2, 4.45
fashionmonger, 5060, 19, 2.84
eyewink, 8070, 23, 2.83
bullyrook, 400, 8, 2.72
fullhearted, 802, 12, 2.6
to canopy, 24700, 60, 2.46
nonregardance, 78, 5, 2.43
coppernose, 1430, 19, 2.42
overcredulous, 484, 12, 2.41
death’s-head, 3430, 30, 2.37
offenseful, 202, 9, 2.3
to uncurl, 4060, 36, 2.3
scrimer, 2120, 27, 2.29
to reverb, 5900, 44, 2.28
ring carrier, 2350, 31, 2.23
over-weathered, 884, 20, 2.22
to subcontract, 33200, 109, 2.21
to bedabble, 34, 4, 2.2
to outswear, 34, 4, 2.2
This is the list of the 20 least deracinated words coined by Shakespeare, with their Google search results, the Google search results when adding AND Shakespeare, and their deracination index values:
to outfrown, 5, 5, 1
to bewhore, 2, 2, 1
to overstink, 4, 4, 1
to outsweeten, 3, 3, 1
bum-bailie, 4, 4, 1
to behowl, 3, 3, 1
to enclog, 11, 10, 1.03
to discandy, 7, 6, 1.06
fubbed off, 59, 41, 1.09
cross-gartered, 1050, 565, 1.09
to outscold, 5, 4, 1.11
flirt-gill, 396, 214, 1.11
to unfool, 21, 15, 1.11
to overbulk, 42, 28, 1.11
scroyle, 624, 311, 1.12
to unsex, 1020, 482, 1.12
semblative, 286, 153, 1.12
drollery, 29700, 9150, 1.12
arch-villain, 32700, 9330, 1.13
rug-headed, 214, 97, 1.17
Here is the bare list of words.
Here is the complete list of words with values, also in CSV format.
And this Impure space is the one used to perform all the Google search tasks and build the table and the CSV. Once the space is opened, it starts searching using the InternetMultiNSearchResults API module. It also contains a simple scatter visualization. As usual, Impure users can copy the code of this space in order to modify it and create new stuff. In general terms, this structure allows measuring how strongly an item is related to a set of other items. In this case the first item is Shakespeare and the others are words or texts, but it could be a brand name and a list of adjectives, an adjective and a list of brand names, a politician and a list of corruption criminals, etc. All these are examples of one-to-many multi-searches, but it could of course be many-to-many, which leads to the construction of networks, a topic I will address in future posts.