Vocab list: Pietr-le-Letton, Chapter 8

I’m making lists of unfamiliar words as I read Georges Simenon’s 1931 Pietr-le-Letton, the novel debut of the famous commissaire Maigret. Here’s my list for Chapter 8 (Maigret Ne Joue Plus) with links to definitions and word frequencies from Google Books NGram Viewer (warning: today’s frequency counts are wonky).

In this chapter, Maigret has been shot! Actually, that happened at the end of chapter 7, but I was unclear on the fact: all I had gleaned was that someone was shot in the final sentence of chapter 7, and I hadn’t realized it was Maigret. In Chapter 8 he first spends a while stumbling around bleeding, then makes his way back to the hotel where his officers were staking out the criminals, only to find one of his officers murdered via chloroform and a long needle to the heart. Finally, he calls in his Chief of Police, cleans himself up, and heads into the field once more to find the bad guys, ‘cuz now it’s personal!

Today’s list is largely words about wounds, bandages, nausea, blood stains, swelling, and lassitude. You know, everyday vocabulary.

expression (root) | Frequency in 2010 | Frequency in 1970 | Frequency in 1930
fouler | 1 in 20,400 | 1 in 21,200 | 1 in 17,800
allure | 1 in 47,600 | 1 in 45,800 | 1 in 33,100
ballant | 1 in 50,200 | 1 in 66,400 | 1 in 61,300
desservir | 1 in 74,600 | 1 in 85,900 | 1 in 71,800
plaie | 1 in 94,500 | 1 in 109,000 | 1 in 67,300
caler | 1 in 143,000 | 1 in 190,000 | 1 in 161,000
gisait | 1 in 156,000 | 1 in 194,000 | 1 in 172,000
frôler | 1 in 183,000 | 1 in 351,000 | 1 in 391,000
béant | 1 in 226,000 | 1 in 352,000 | 1 in 317,000
dénicher | 1 in 278,000 | 1 in 1,050,000 | 1 in 1,070,000
pansement | 1 in 347,000 | 1 in 567,000 | 1 in 260,000
netteté | 1 in 361,000 | 1 in 157,000 | 1 in 100,000
recroquevillé | 1 in 404,000 | 1 in 1,110,000 | 1 in 1,610,000
happer | 1 in 420,000 | 1 in 787,000 | 1 in 811,000
souillure | 1 in 423,000 | 1 in 501,000 | 1 in 460,000
ahurissant | 1 in 445,000 | 1 in 596,000 | 1 in 576,000
raviser | 1 in 531,000 | 1 in 1,130,000 | 1 in 1,010,000
poindre | 1 in 628,000 | 1 in 814,000 | 1 in 729,000
bourrelet | 1 in 978,000 | 1 in 259,000 | 1 in 186,000
omoplate | 1 in 1,080,000 | 1 in 1,350,000 | 1 in 653,000
bougonner | 1 in 1,130,000 | 1 in 2,310,000 | 1 in 2,450,000
divaguer | 1 in 1,140,000 | 1 in 1,680,000 | 1 in 1,640,000
boursouflé | 1 in 1,430,000 | 1 in 1,560,000 | 1 in 1,350,000
tuméfier | 1 in 1,740,000 | 1 in 2,810,000 | 1 in 981,000
hébétude | 1 in 2,130,000 | 1 in 3,010,000 | 1 in 3,610,000
tournemain | 1 in 4,290,000 | 1 in 5,130,000 | 1 in 4,040,000
écoeurer | 1 in 5,780,000 | 1 in 16,500,000 | 1 in 35,900,000
écoeurement | 1 in 25,800,000 | 1 in 50,200,000 | 1 in 110,000,000

A few notable things today:

  • The word gisait means “was lying”, as in a dead body sprawled out on the floor. It’s commonly used for bodies, dead or alive, lying on surfaces. The interesting thing is that the infinitive is gésir, yet all the conjugated forms start with gis- (or gî-, as in il gît). Apparently it is also used only in restricted tenses: the present indicative, the imperfect indicative, and the present participle. I’ve never encountered this pattern before.
  • The word une plaie means a wound. The frequency of this word’s usage in books is fascinating:
[Figure: NGram Viewer chart for “plaie” in French books. Any guesses what happened from 1914 to 1918 to cause this spike in usage?]
  • That spike around 1916? That’s the First World War. I don’t know why there isn’t a similar spike during World War II. All the wounded soldiers died, so the wounds weren’t worth writing about? A different word was adopted to describe these wounds? Nobody had time to write about it? Or maybe these books are just not in Google’s data for some reason.
  • The word écoeurement (disgust, nausea) is the rarest on this list, a whopping 1 in 26 million these days. But it’s not that hard to find on the Web, so I wonder if it’s just not a bookish word? Note that the word is having a resurgence. When Simenon selected it, the word had a prevalence in print of just 1 in 110 million!
  • Google NGram Viewer released a new corpus this week, with data running all the way up to 2019. So I shifted my window to look at the years 1930, 1970, and 2010. Recall the book was written in 1931, so the 1930 data is the environment Simenon was writing in.
  • That said, the frequencies are not entirely trustworthy at the moment. I think the new release does very aggressive pooling. So for example, ballant (dangling) is broken by its conflation with balle (a ball). I’m sure the “dangling” meaning is rarer than 1 in 50,000 words. I’ll work to get these cleaned up before long, but meanwhile I don’t trust the frequencies more common than 1 in 100,000.

Vocab list: Pietr-le-Letton, Chapter 7

I’m making lists of unfamiliar words as I read Georges Simenon’s 1931 Pietr-le-Letton, the novel debut of the famous commissaire Maigret. Here’s my list for Chapter 7 (Troisième Entracte) with links to definitions and word frequencies from Google Books NGram Viewer.

In this chapter, Maigret stops briefly at the hotel where a person of interest is staying, then follows them first to the theater and then to a cabaret nightclub. I’m a fan of French theater, so many theater-specific vocab words did not make it onto this list (though some did). The list is largely words about coming, going, dining, and dressing.

I’ve augmented my frequency tables based on a comment from reader F. P., who suggested that I display both modern word frequency and contemporaneous frequency. The most recent data I have from the Google NGram Viewer ends at 2008. Pietr-le-Letton was published in 1931, but I rounded back to 1928 for aesthetics. 1968 falls midway between these two.

To put these numbers in some perspective, a typical novel is 60,000 to 100,000 words long, and real hefty novels top out around 500,000 words. So when you see a word frequency of 1 in 1,000,000 you should think “I could read 5-10 novels and never see this word or its variants.” Recall the frequencies shown pool together various inflections of the word, so the row for matelassé is really all of matelassé, matelassée, matelassées, matelassés, matelasser, matelasse, matelassent, matelassant, and matelassait combined.
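For anyone who wants to check that intuition, here is a minimal sketch of the arithmetic in Python. It assumes, unrealistically, that every word of a book is an independent draw with the same chance of being the target word; real vocabulary clusters by topic and author, so treat the output as a rough guide. The function names are mine, made up for illustration.

```python
def novels_per_sighting(one_in_n: float, novel_words: int) -> float:
    """Average number of novels of this length read per sighting of the word."""
    return one_in_n / novel_words


def chance_of_never_seeing(one_in_n: float, words_read: int) -> float:
    """Probability of zero sightings in words_read words.

    Assumes each word is an independent draw (a simplification).
    """
    return (1.0 - 1.0 / one_in_n) ** words_read


# A 1-in-1,000,000 word, for a typical, a long, and a very hefty novel.
for novel_words in (60_000, 100_000, 500_000):
    print(novel_words, novels_per_sighting(1_000_000, novel_words))

# Chance of reading five 100,000-word novels without meeting the word once.
print(round(chance_of_never_seeing(1_000_000, 5 * 100_000), 2))
```

Under that independence assumption, a 1-in-1,000,000 word turns up about once per ten 100,000-word novels, and there is roughly a 61% chance of getting through five such novels without seeing it at all.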

expression (root) | Frequency in 2008 | Frequency in 1968 | Frequency in 1928
ruée | 1 in 7,420 | 1 in 7,020 | 1 in 5,920
dresser | 1 in 25,200 | 1 in 18,700 | 1 in 14,200
soulevé | 1 in 27,400 | 1 in 21,200 | 1 in 18,500
lasse | 1 in 30,600 | 1 in 31,600 | 1 in 32,400
cerne | 1 in 64,300 | 1 in 104,000 | 1 in 236,000
cernée | 1 in 64,300 | 1 in 104,000 | 1 in 236,000
coulisses | 1 in 238,000 | 1 in 132,000 | 1 in 224,000
affermissant | 1 in 328,000 | 1 in 200,000 | 1 in 176,000
crispé | 1 in 418,000 | 1 in 373,000 | 1 in 453,000
corbeille | 1 in 433,000 | 1 in 431,000 | 1 in 259,000
croquer | 1 in 518,000 | 1 in 808,000 | 1 in 874,000
Mâcon | 1 in 606,000 | 1 in 512,000 | 1 in 476,000
vergogne | 1 in 662,000 | 1 in 754,000 | 1 in 925,000
navré | 1 in 677,000 | 1 in 564,000 | 1 in 455,000
badaud | 1 in 818,000 | 1 in 774,000 | 1 in 813,000
blanchâtre | 1 in 864,000 | 1 in 484,000 | 1 in 293,000
réverbère | 1 in 886,000 | 1 in 659,000 | 1 in 825,000
bleuté | 1 in 955,000 | 1 in 919,000 | 1 in 932,000
crépitant | 1 in 1,010,000 | 1 in 796,000 | 1 in 1,030,000
désaltérer | 1 in 1,160,000 | 1 in 1,590,000 | 1 in 1,190,000
crotté | 1 in 1,250,000 | 1 in 1,470,000 | 1 in 1,220,000
piétinements | 1 in 1,320,000 | 1 in 936,000 | 1 in 1,470,000
hargneux | 1 in 1,560,000 | 1 in 1,090,000 | 1 in 1,100,000
emmitouflée | 1 in 2,340,000 | 1 in 3,280,000 | 1 in 3,960,000
péristyle | 1 in 2,450,000 | 1 in 1,380,000 | 1 in 970,000
débraillé | 1 in 2,760,000 | 1 in 1,590,000 | 1 in 1,480,000
entrefilet | 1 in 3,080,000 | 1 in 2,590,000 | 1 in 2,070,000
plastron | 1 in 3,210,000 | 1 in 2,290,000 | 1 in 1,630,000
lestement | 1 in 3,630,000 | 1 in 3,880,000 | 1 in 1,450,000
matelassé | 1 in 7,210,000 | 1 in 5,220,000 | 1 in 6,830,000
contremarque | 1 in 10,500,000 | 1 in 26,000,000 | 1 in 12,300,000
maigriote | 1 in 75,900,000 | 1 in 27,400,000 | 1 in 23,400,000
panneau-réclame | | |

F. P. also suggested that I sort the words by frequency, which I have using 2008 data. Those interested in the details of the data generation can read my code.

Looking down the first column of the table, I see that there are a few words I was unfamiliar with that are currently more common than 1 in 100,000 words of book text. But the bulk of the new-to-me words are rarer than that, and many are rarer than one-in-a-million. And recall that this statistic pools together the various inflections of each word.

Looking across the rows, you can see which words were rare even in Simenon’s time, and which were relatively common then but have since fallen out of favor. For example, blanchâtre is currently a 1-in-864,000 word, though when Simenon wrote it was only a 1-in-293,000 word. Likewise péristyle was a one-in-a-million word then, but has become 2.5x more rare since. On the other hand, lasse was pretty common then and is pretty common now, piétinements was very rare then and now, and maigriote was already off the charts rare in 1928, coming in at a whopping 1-in-23,400,000 (it’s 3x as rare now, but…).
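The "how much rarer" comparisons are just ratios of the two "1 in N" figures. Here is a small sketch of that calculation using four rows copied from the table above. It is an illustration only, not the code that generated the table, and the function name is hypothetical.

```python
def rarity_ratio(one_in_now: float, one_in_then: float) -> float:
    """How many times rarer a word is now than it was then.

    Values above 1 mean the word has become rarer; below 1, more common.
    """
    return one_in_now / one_in_then


# (1-in-N for 2008, 1-in-N for 1928), copied from the Chapter 7 table.
rows = {
    "lasse": (30_600, 32_400),
    "blanchâtre": (864_000, 293_000),
    "péristyle": (2_450_000, 970_000),
    "maigriote": (75_900_000, 23_400_000),
}

for word, (now, then) in rows.items():
    print(f"{word}: {rarity_ratio(now, then):.1f}x")
```

On these figures blanchâtre comes out about 2.9 times rarer than in 1928, péristyle about 2.5 times, maigriote about 3.2 times, and lasse essentially unchanged at 0.9.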

Vocab list: Pietr-le-Letton, Chapter 6

I’m making lists of unfamiliar words as I read Georges Simenon’s Pietr-le-Letton. Here’s my list for Chapter 6 (Au Roi de Sicile), with links to the search result page on Linguee and word frequencies from the Google NGram Viewer.

In this chapter, Maigret follows up a lead in a run-down building in the Jewish quarter of town, near rue des Rosiers in le Marais. Simenon explicitly calls this place «le ghetto de Paris». He interviews the building manager, a not-very-cooperative Jew. The vocabulary has a lot of words about ragged, crowded, noisy, dilapidated, damp and dirty conditions.

28 unfamiliar words in 7 1/2 pages (nearly 4 a page) is getting up there, but still fewer than 5 per page, which is the cutoff for a “just right book”.

expression (root) | frequency
bondé | 1 in 742,000
détrempé | 1 in 2,040,000
patauger | 1 in 1,220,000
ballotté | 1 in 834,000
pain azyme | 1 in 10,200,000
grouillante | 1 in 1,330,000
grouillement | 1 in 3,190,000
faïence | 1 in 677,000
étayer | 1 in 2,360
boyau | 1 in 912,000
calotte | 1 in 971,000
crasseux | 1 in 1,330,000
empâtée | 1 in 3,710,000
peignoir | 1 in 1,500,000
entrouvrir | 1 in 382,000
esclandre | 1 in 3,310,000
ameuter | 1 in 1,470,000
grommeler | 1 in 942,000
parois | 1 in 69,100
crayeux | 1 in 3,880,000
sournois | 1 in 482,000
effaré | 1 in 712,000
loqueteux | 1 in 6,740,000
verdâtre | 1 in 923,000
clapoter | 1 in 5,110,000
vol à l’esbroufe | None
en faction | 1 in 2,420,000
pestant | 1 in 102,000
ronfler | 1 in 983,000

The frequency numbers are from the French Google Books corpus, specifically books published in 2007. They count how many words of such books you would have to read on average before coming upon the given word in any of its inflected forms. As you can see, a lot of these are fairly literary or old-fashioned words – Pietr-le-Letton was written in 1931, after all.

There are a few glitches in this analysis. The word étayer (meaning “to support”) is not so common you’d see it once in 2,360 words. Rather, Google NGram Viewer is conflating the 3rd person plural imparfait of the verb être (ils étaient) with the 3rd person plural present of the verb étayer (ils étaient). Same spelling, very different frequency. So take the frequency estimates with a grain of salt.

Vocab list: Pietr-le-Letton, Chapter 5

I’m making lists of unfamiliar words as I read Georges Simenon’s Pietr-le-Letton. Below is my list for Chapter 5 (Le Russe Ivre), with links to the search result page on Linguee and word frequencies from the Google NGram Viewer.

The chapter takes place in a run-down bar in a fishing town (Fécamp) in winter, which accounts for why there are so many words about boats, bars, and rain. There are 26 words here and the chapter is 9 pages long, so that’s about 3 new words a page – a “just right book” for my reading level.

expression (root) | frequency
prunelles | 1 in 742,000
bouges | 1 in 61,200
soutiers | 1 in 11,100,000
zinc | 1 in 396,000
canaille | 1 in 690,000
entrebâillement | 1 in 4,290,000
crapuleux | 1 in 1,690,000
louvoyer | 1 in 1,640,000
luisant | 1 in 670
oeillade | 1 in 13,900,000
se saouler | 1 in 5,040,000
vergue | 1 in 1,610,000
tressaillir | 1 in 454,000
heurter | 1 in 48,400
toussotement | 1 in 11,600,000
buée | 1 in 1,670,000
ricaner | 1 in 528,000
bac | 1 in 82,000
tremper | 1 in 140,000
tiraillait | 1 in 594,000
bec-de-cane | 1 in 19,800,000
tournant | 1 in 8,540
marchand de bestiaux | 1 in 17,500,000
entrouverte | 1 in 382,000
blême | 1 in 860,000
tasser | 1 in 166,000

The frequency numbers are from the French Google Books corpus, specifically books published in 2007. They count how many words of such books you would have to read on average before coming upon the given word in any of its inflected forms. As you can see, a lot of these are fairly literary or old-fashioned words – Pietr-le-Letton was written in 1931, after all.

There are a few glitches in this analysis. The word luisant, from luire = to shine, is not so common you’d see it once in 670 words. Rather, Google NGram Viewer counts lui as a form of luire. Strictly speaking lui is the past participle of luire, but nearly every lui in print is the pronoun, which is very common, and so the conflation makes the estimate worthless. The single form luisant occurs 1 in 1,160,000, but that doesn’t account for all the other forms of luire. So take the frequency estimates with a grain of salt.
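To see how badly one over-common form can swamp a pooled figure, here is a rough sketch of the pooling arithmetic. It assumes the pooled rate is simply the sum of the per-form rates, which is my reading of how the numbers combine and not a documented NGram Viewer formula. The function names are hypothetical, and the 2,000,000 in the last line is a made-up value used only to show two forms being combined.

```python
def pooled_one_in(forms_one_in):
    """Combine per-form '1 in N' rates into a single pooled '1 in N' rate.

    Assumes the pooled rate is the plain sum of the individual rates.
    """
    return 1.0 / sum(1.0 / n for n in forms_one_in)


def share_of_pool(form_one_in, pool_one_in):
    """Fraction of the pooled occurrences contributed by one form."""
    return pool_one_in / form_one_in


LUIRE_POOLED = 670          # pooled figure quoted in the table above
LUISANT_ALONE = 1_160_000   # figure for the single form luisant

# How much of the pooled "luire" count is actually the form luisant?
print(f"{share_of_pool(LUISANT_ALONE, LUIRE_POOLED):.4%}")

# Pooling luisant with one other form at a MADE-UP rate of 1 in 2,000,000.
print(round(pooled_one_in([LUISANT_ALONE, 2_000_000])))
```

On the quoted figures, luisant accounts for well under a tenth of a percent of the pooled count, which is another way of saying that almost all of the 1-in-670 figure comes from lui.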

I’ll be curious to see if my list length diminishes in later chapters and later novels. I’m reminded of the game I used to play when reading Sherlock Holmes stories aloud with my daughter – we’d joke about how many paragraphs into a story Conan Doyle could get without using the word “singular”. It was rarely double-digit.

Lesson 2020-07-01

My lesson with my teacher N today was mostly conversation (tout en français, bien sûr), and mostly what we discussed was the process of creating this website, www.monsieurmiller.com. Turns out I really don’t know how to pronounce the first syllable of monsieur. In general, my pronunciation is pretty terrible, but that’s an awkwardly beginner word for me not to have the correct pronunciation ingrained.

In the discussion, we talked over the nuances of construire, créer, and édifier, and decided that créer was the best word for the start of a new website. Overall it was good exercise of technical web vocabulary: domaine, lien, site, enregistrer, navigateur, onglet, etc. I tried to articulate the difference between a page and a post, which is not clear to me even in English. N asked me whether I intended to make the site bilingue, which for the time being I do not. Once I get my feet under me I may try writing some all-French posts.

We spent a little time looking at the several idiomatic expressions using the word lieu, following this quiz from www.partajondelfdalf.com, a site I had not encountered before.

Other tidbits: the expression en avoir marre de keeps tripping me up, as I think of en as absorbing the final de, as in “Essais d’ouvrir la porte” –> “J’en ai essayé.” But you need both the en and the de in that expression: “Ma famille en a marre de m’écouter parler de la France.” and not “Ma famille a marre de m’écouter parler de la France.”