Both HTRC Analytics and Voyant Tools have the ability to isolate word frequencies via both lists and word clouds. This enables the side-by-side comparison displayed below, where the left-hand list and word cloud reflect the language used in turn of the century weight loss texts, and the right-hand list and word cloud reflect the language used in the weight loss verticals of Women's Health and Men's Health in 2019.
NOTE: Before viewing these side-by-side results, the reader should be aware of the following: (1) this study compares the text of books with that of a website, the latter consisting mostly of headlines, so the type of texts being compared is significantly different and (2) the weight loss verticals are both from one day in the year 2019, while the turn of the century corpus spans a time frame of thirty years. You can find more discussion of this on the Future Possibilities page of this site.
Comparing the language in this way allows the viewer to immediately observe a few things. One is the prevalence of the term "fat" in turn of the century texts, a term that does not appear among the top ten words in the weight loss verticals for Men's Health and Women's Health, or indeed anywhere in the complete 59-word frequency list for the weight loss verticals corpus. The word "fat" has been replaced as #1 by the term "weight," suggesting that this has become the preferred term for poundage one may want to lose.
Speaking of losing, three different iterations of the concept - "loss," "lose" and "lost" - appear in the top ten list of words used in the weight loss verticals corpus. Comparatively, we don't see any iteration of this concept in the turn of the century texts until well down the list: "lose" is the 365th most common term, with "lost" appearing at 395 and "loss" at 464. That all of these words appear in both corpuses is significant, as it reveals that language implying deprivation has always been part of the weight loss vocabulary (the first definition for "lose" in the Oxford English Dictionary is "Be deprived of or cease to have or retain"). However, it's also significant that these words appear so high on the list of current texts and so low on the list of turn of the century texts. We might observe that the language in the latter is more about the specifics of how one loses weight - with food terms especially prominent in the first 100 words - rather than the loss itself.
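The rank positions cited above come from the frequency lists produced by HTRC Analytics and Voyant Tools. As a rough illustration of how such a ranking can be derived, here is a minimal Python sketch. It is not the procedure either tool uses: the tokenizer, the stopword list, and the sample sentence are all hypothetical stand-ins, and the sample is not drawn from either corpus.

```python
from collections import Counter
import re

def rank_terms(text, stopwords=frozenset()):
    """Return a dict mapping each word to its 1-based frequency rank."""
    # Simple tokenizer (an assumption): lowercase runs of letters and apostrophes
    tokens = [t for t in re.findall(r"[a-z']+", text.lower()) if t not in stopwords]
    counts = Counter(tokens)
    # most_common() sorts by descending count; tied words keep first-seen order
    return {word: rank for rank, (word, _) in enumerate(counts.most_common(), start=1)}

# Hypothetical toy text, not taken from either corpus
sample = "weight loss tips to lose weight and keep the weight off"
ranks = rank_terms(sample, stopwords={"to", "and", "the"})

for term in ("weight", "loss", "lose", "lost"):
    print(term, ranks.get(term, "not present"))
```

Looking up "lose," "lost" and "loss" in a mapping like this, once for each corpus, is one way to reproduce the kind of rank comparison made here. Note that tied words are ordered arbitrarily, so ranks far down a long list are only approximate.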
An exception is the prominence of the term "keto" in the weight loss verticals. "Keto" refers to a low- to no-carb diet in which "the absence of carbs and abundance of fat pushes your body into a biological state called ketosis, during which you burn fat instead of glucose" (Easter). Very particular diets with recognizable names like this - sometimes called "fad diets" - have been around since the early days of dieting (see Background), and some examples exist in the turn of the century weight loss corpus, most significantly "banting" at 652 and "mahdah" at 966. While the Mahdah diet is the sole topic of one of the books in the turn of the century corpus, it still does not rate particularly high on the frequency list, as many different modes of losing weight are discussed. By comparison, keto appears to dominate the weight loss sections of both Men's Health and Women's Health.
While the language used in Men's Health and Women's Health does appear to share many similarities - from the prevalence of the word "weight" to the focus on loss and losing - there are a few notable exceptions revealed by Voyant's Trends tool. Below are smaller versions of the line graphs that appear on the Results page, displayed here side-by-side for ease of comparison. In each of these graphs, corpus item 1 - which appears as the left-hand column - is the weight loss vertical for Men's Health, while corpus item 2 - which appears as the right-hand column - is the weight loss vertical for Women's Health. This numbering was decided by Voyant based on the order in which the corpus items were fed into the system.
The line graph in the upper left corner, which shows Voyant's suggested terms, reveals that while "weight" and "loss" both appear slightly more often in Women's Health than Men's Health, the word "pounds" is much more common in Men's Health. This suggests that where Men's Health uses "weight" less often than Women's Health does, it may be substituting the equivalent "pounds." More striking is the fact that "keto" is so much more prominent in Women's Health, suggesting that the women-targeted publication might feature diet trends more heavily than the one targeted to men.
This roughly tracks with the line graph at bottom right, which shows that Men's Health features the word "fitness" much more often than does Women's Health. This suggests that Women's Health might put more focus in general on diet than exercise. However, this does not mean that Men's Health does the opposite - in fact, the same graph reveals that Men's Health features the word "eating" just as much as Women's Health.
More dramatic disparities occur in the other two graphs. The graph at the bottom left shows the word "sex" appearing much more frequently in Men's Health than in Women's Health, while the word "beauty" - common in Women's Health - never appears in Men's Health. This can possibly be explained by the fact that "beauty" is a word more commonly used to describe a woman's appearance than a man's (the second definition of "beauty" in the Oxford English Dictionary is "A beautiful woman"). "Sex," however, is not typically a gendered term, and the fact that it appears so much more in Men's Health than in Women's Health suggests that the possibility of sex might be considered by the writers as more of a motivating factor for male weight loss than for female weight loss - the latter might be considered to be more motivated by the possibility of beauty, given the information in this graph.
The upper right graph also shows a sharp disparity between the high incidence of "believe" in Men's Health and its absence in Women's Health, which instead has a high incidence of "want." Turning again to the Oxford English Dictionary, the word "believe" means "Accept that (something) is true, especially without proof," while the word "want" means "Have a desire to possess or do (something); wish for." Thus the language targeted towards men emphasizes accepting without proof, e.g. believe you can lose the weight, while the language targeted towards women emphasizes desiring possession of something, e.g. I want a thinner figure.
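The comparisons in these graphs depend on how often a term occurs in each corpus item. Because the two verticals may differ in length, one reasonable way to compare them is by relative frequency, that is, occurrences per fixed number of tokens. The Python sketch below illustrates that idea only. It assumes a simple tokenizer and a per-1,000-token scale, neither of which is confirmed to match Voyant's Trends tool, and the two texts are invented placeholders rather than the actual verticals.

```python
import re

def relative_frequency(text, term):
    """Occurrences of `term` per 1,000 tokens of `text`."""
    # Simple tokenizer (an assumption): lowercase runs of letters and apostrophes
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return 1000 * tokens.count(term.lower()) / len(tokens)

# Hypothetical placeholder texts standing in for corpus items 1 and 2
docs = {
    "item 1": "lose ten pounds with this fitness plan",
    "item 2": "the keto diet and weight loss",
}

for term in ("fitness", "keto"):
    for name, text in docs.items():
        print(term, name, round(relative_frequency(text, term), 1))
```

A term that is absent from one item, as "beauty" is from Men's Health, simply yields a relative frequency of zero for that item.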
These four line graphs reveal some of the differences between the weight loss language targeted towards men versus that targeted towards women in 2019. But was gender significant in any of the turn of the century texts?
Unfortunately, none of the HTRC Analytics algorithms directly compare the components of a corpus in this way. And even if they did, the books in the corpus appear to be non-gender specific (with the possible exception of Lose Weight and be Well: The Story of a Stout Woman Now Thin).
However, the InPho Topic Modeler might allow for some examination of gender. These topic sets provide "both a predictive model of future text and a latent topic representation of the corpus" (Chang, 1). In other words, looking at this list of sets can give the reader a quick sense of what topics this collection of texts covers and how it covers them. From this we might infer that looking at the words bundled with gender terms could allow us to make some assumptions about the language associated with those genders.
Woman:
Topic 18: age, medium, ounces, dieting, hips, minutes, woman, another, neck, rest
Topic 37: woman, tea, old, left, double, average, cucumbers, habit, beauty, four
Men:
Topic 10: egg, lemon, pineapple, table, always, half, about, men, make, wish
Topic 43: method, form, onehalf, cover, prunes, men, side, information, vance
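Finding the words bundled with a gender term amounts to a simple lookup across topic sets. The Python sketch below shows that lookup using only the four topics listed above, copied as they appear here. The helper name co_occurring is hypothetical, and the sketch does not reproduce how the InPho Topic Modeler builds or ranks its topics.

```python
# Word lists copied from the four topics listed above (not the full model)
topics = {
    18: ["age", "medium", "ounces", "dieting", "hips", "minutes", "woman", "another", "neck", "rest"],
    37: ["woman", "tea", "old", "left", "double", "average", "cucumbers", "habit", "beauty", "four"],
    10: ["egg", "lemon", "pineapple", "table", "always", "half", "about", "men", "make", "wish"],
    43: ["method", "form", "onehalf", "cover", "prunes", "men", "side", "information", "vance"],
}

def co_occurring(term, topics):
    """Map each topic id that contains `term` to the other words in that topic."""
    return {
        topic_id: [word for word in words if word != term]
        for topic_id, words in topics.items()
        if term in words
    }

for term in ("woman", "men"):
    print(term, co_occurring(term, topics))
```

Because only four topics are included, the output reflects this excerpt rather than the full topic model.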
"Woman" appears with age terms ("age," "old") as well as body parts ('hips," "neck") and size-associated terms ("dieting," "ounces," "medium"). As in the Trends graph of Women's Health, the word "beauty" is also associated with women.
"Men" appears most often with fruits ("lemon," "pineapple," "prunes") and scholarly terms ("method," "information"). The word "wish" is also associated with men, which mirrors the word "believe" that shows up in the Men's Health Trends graph.
While the commonalities with the Trends graphs are particularly suggestive, further investigation suggests that meaning-making analysis of topic models should be approached with caution. For example, according to the HTRC Analytics token count, the word "man" appears 143 times while the word "woman" appears 123 times, yet "woman" appears a couple of times in the topic model and "man" does not appear at all, only "men." Given this, it is possible that the words chosen and bundled tell us more about the algorithm itself than they do about the texts in the corpus.