I’m making lists of unfamiliar words as I read Georges Simenon’s 1931 Pietr-le-Letton, the first novel to feature the famous commissaire Maigret. Here’s my list for Chapter 7 (Troisième Entracte) with links to definitions and word frequencies from the Google Books Ngram Viewer.
In this chapter, Maigret stops briefly at the hotel where a person of interest is staying, then follows them first to the theater and then to a cabaret nightclub. I’m a fan of French theater, so many theater-specific vocab words did not make it onto this list (though some did). The list is largely words about coming, going, dining, and dressing.
I’ve augmented my frequency tables based on a comment from reader F. P., who suggested that I display both modern word frequency and contemporaneous frequency. The most recent data I have from the Google NGram Viewer ends at 2008. Pietr-le-Letton was published in 1931, but I rounded back to 1928 for aesthetics. 1968 falls midway between these two.
To put these numbers in some perspective, a typical novel is 60,000 to 100,000 words long, and really hefty novels top out around 500,000 words. So when you see a word frequency of 1 in 1,000,000, you should think “I could read 5-10 novels and never see this word or its variants.” Note that the frequencies shown pool together the various inflections of each word, so the row for matelassé is really all of matelassé, matelassée, matelassées, matelassés, matelasser, matelasse, matelassent, matelassant, and matelassait combined.
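For anyone who wants to check that back-of-the-envelope reasoning, here is a minimal sketch of the arithmetic. It assumes, unrealistically, that a word turns up independently and at a constant rate throughout whatever you read (real vocabulary clusters by author and topic), and it uses a Poisson approximation for the chance of never meeting the word. The function names are my own, purely for illustration.

```python
import math

def expected_occurrences(one_in: float, words_read: int) -> float:
    """Expected sightings of a word whose frequency is 1 in `one_in` words."""
    return words_read / one_in

def chance_of_never_seeing(one_in: float, words_read: int) -> float:
    """Poisson approximation: P(zero sightings) = exp(-expected sightings)."""
    return math.exp(-expected_occurrences(one_in, words_read))

# Novel lengths taken from the paragraph above; the word is a 1-in-1,000,000 word.
for label, length in [
    ("typical novel, short end", 60_000),
    ("typical novel, long end", 100_000),
    ("hefty novel", 500_000),
]:
    sightings = expected_occurrences(1_000_000, length)
    miss = chance_of_never_seeing(1_000_000, length)
    print(f"{label}: {sightings:.2f} expected sightings, {miss:.0%} chance of none")
```

Under those assumptions, even ten 100,000-word novels leave roughly a one-in-three chance of never meeting a one-in-a-million word.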
expression (root) | Frequency in 2008 | Frequency in 1968 | Frequency in 1928 |
---|---|---|---|
ruée | 1 in 7,420 | 1 in 7,020 | 1 in 5,920 |
dresser | 1 in 25,200 | 1 in 18,700 | 1 in 14,200 |
soulevé | 1 in 27,400 | 1 in 21,200 | 1 in 18,500 |
lasse | 1 in 30,600 | 1 in 31,600 | 1 in 32,400 |
cerne | 1 in 64,300 | 1 in 104,000 | 1 in 236,000 |
cernée | 1 in 64,300 | 1 in 104,000 | 1 in 236,000 |
coulisses | 1 in 238,000 | 1 in 132,000 | 1 in 224,000 |
affermissant | 1 in 328,000 | 1 in 200,000 | 1 in 176,000 |
crispé | 1 in 418,000 | 1 in 373,000 | 1 in 453,000 |
corbeille | 1 in 433,000 | 1 in 431,000 | 1 in 259,000 |
croquer | 1 in 518,000 | 1 in 808,000 | 1 in 874,000 |
Mâcon | 1 in 606,000 | 1 in 512,000 | 1 in 476,000 |
vergogne | 1 in 662,000 | 1 in 754,000 | 1 in 925,000 |
navré | 1 in 677,000 | 1 in 564,000 | 1 in 455,000 |
badaud | 1 in 818,000 | 1 in 774,000 | 1 in 813,000 |
blanchâtre | 1 in 864,000 | 1 in 484,000 | 1 in 293,000 |
réverbère | 1 in 886,000 | 1 in 659,000 | 1 in 825,000 |
bleuté | 1 in 955,000 | 1 in 919,000 | 1 in 932,000 |
crépitant | 1 in 1,010,000 | 1 in 796,000 | 1 in 1,030,000 |
désaltérer | 1 in 1,160,000 | 1 in 1,590,000 | 1 in 1,190,000 |
crotté | 1 in 1,250,000 | 1 in 1,470,000 | 1 in 1,220,000 |
piétinements | 1 in 1,320,000 | 1 in 936,000 | 1 in 1,470,000 |
hargneux | 1 in 1,560,000 | 1 in 1,090,000 | 1 in 1,100,000 |
emmitouflée | 1 in 2,340,000 | 1 in 3,280,000 | 1 in 3,960,000 |
péristyle | 1 in 2,450,000 | 1 in 1,380,000 | 1 in 970,000 |
débraillé | 1 in 2,760,000 | 1 in 1,590,000 | 1 in 1,480,000 |
entrefilet | 1 in 3,080,000 | 1 in 2,590,000 | 1 in 2,070,000 |
plastron | 1 in 3,210,000 | 1 in 2,290,000 | 1 in 1,630,000 |
lestement | 1 in 3,630,000 | 1 in 3,880,000 | 1 in 1,450,000 |
matelassé | 1 in 7,210,000 | 1 in 5,220,000 | 1 in 6,830,000 |
contremarque | 1 in 10,500,000 | 1 in 26,000,000 | 1 in 12,300,000 |
maigriote | 1 in 75,900,000 | 1 in 27,400,000 | 1 in 23,400,000 |
panneau-réclame | — | — | — |
F. P. also suggested that I sort the words by frequency, which I have done, using the 2008 data. Those interested in the details of the data generation can read my code.
Looking down the first column of the table, I see that there are a few words I was unfamiliar with that are currently more common than 1 in 100,000 words of book text. But the bulk of the new-to-me words are rarer than that, and many are rarer than one-in-a-million. And recall that this statistic pools together the various inflections of each word.
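To make the pooling concrete: relative frequencies add, so a pooled “1 in N” figure is the reciprocal of the sum of the per-form rates. The sketch below is not the code that generated the table. The function name is hypothetical, and the per-form numbers are invented placeholders, not real Ngram data.

```python
def pooled_one_in(per_form_one_in: dict[str, float]) -> float:
    """Combine per-form '1 in N' figures into a single pooled '1 in N' figure."""
    # Convert each "1 in N" into a rate, add the rates, then invert the total.
    pooled_rate = sum(1.0 / n for n in per_form_one_in.values())
    return 1.0 / pooled_rate

# Invented placeholder values for three inflections of one imaginary root.
forms = {
    "form_a": 20_000_000,
    "form_b": 20_000_000,
    "form_c": 10_000_000,
}

print(f"pooled: 1 in {pooled_one_in(forms):,.0f}")
```

The pooled word is always at least as common as its most common inflection, which is why a row can look more frequent than any single form you happen to remember seeing.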
Looking across the rows, you can see which words were rare even in Simenon’s time, and which were relatively common then but have since fallen out of favor. For example, blanchâtre is currently a 1-in-864,000 word, but when Simenon wrote, it was a 1-in-293,000 word. Likewise péristyle was a one-in-a-million word then, but has become 2.5x rarer since. On the other hand, lasse was pretty common then and is pretty common now, piétinements was very rare then and now, and maigriote was already off-the-charts rare in 1928, coming in at a whopping 1-in-23,400,000 (it’s 3x as rare now, but…).
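Those “how many times rarer” comparisons are just ratios of the “1 in N” figures. Here is a small sketch using the rows quoted above, copied from the table; the dictionary layout and function name are my own choices for illustration.

```python
# word -> "1 in N" figures for (2008, 1968, 1928), copied from the table above.
ONE_IN = {
    "blanchâtre": (864_000, 484_000, 293_000),
    "péristyle": (2_450_000, 1_380_000, 970_000),
    "lasse": (30_600, 31_600, 32_400),
    "piétinements": (1_320_000, 936_000, 1_470_000),
    "maigriote": (75_900_000, 27_400_000, 23_400_000),
}

def rarity_ratio(word: str) -> float:
    """How many times rarer the word was in 2008 than in 1928 (below 1 means more common)."""
    n_2008, _n_1968, n_1928 = ONE_IN[word]
    return n_2008 / n_1928

for word in ONE_IN:
    print(f"{word}: {rarity_ratio(word):.2f}x as rare in 2008 as in 1928")
```

A ratio near 1 (lasse, piétinements) means the word has held steady, while blanchâtre, péristyle, and maigriote each come out somewhere between 2.5x and 3.3x rarer.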