"She Doesn't Look Presidential"
Analyzing Gendered Fashion in Politics
Past studies show that news media delegitimize women candidates in multiple ways, including more frequently referring to women candidates by their first names (Uscinski and Goren 2011) and focusing excessively on their appearance (Cummings and Terrion 2021). Meanwhile, qualitative evidence describes women politicians in the US as having a “husbands, hair, hemline problem” (Duerst-Lahti 2005), though to our knowledge there has been no large-scale quantitative study documenting whether either news media or voters focus disproportionately on women politicians’ personal lives or appearance. This project explores the extent to which commentary on women politicians’ clothing and appearance dominates online discussions of these candidates. We focus specifically on the 2020 Democratic presidential primary due to the historically large number of women running in that race. Our dataset includes over 25,000 tweets mentioning these candidates, and we use novel AI language classification methods to categorize tweets. We have developed a model using the Hugging Face Chat API to classify text as either containing references to candidates’ clothing or not. We hope to provide insight into how gendered rhetoric can affect voters’ assessments of candidates.
Introduction
The 2020 Democratic Party presidential primaries drew the largest number of candidates since the modern primaries originated in 1972, with twenty-nine declarations (Burns et al. 2020). Of the candidates to make it to the primaries, six were women: Senator Amy Klobuchar, Senator Elizabeth Warren, then-Senator Kamala Harris, Senator Kirsten Gillibrand, Representative Tulsi Gabbard, and author Marianne Williamson. This was the most ever in an American presidential election. Many of these women were noted to be well-qualified as presidential candidates and had high name recognition among Democratic primary voters (Hora 2019). With this record number of candidates, it seemed as though there might be potential to elect the nation’s first female president, following Hillary Clinton’s defeat by Donald Trump in 2016. This also set the stage for the media and public to focus on the gender dynamics between the candidates. Recent analyses of media coverage show that both online and print media coverage of women candidates relied on gender stereotypes (Cassese et al. 2022; Zurbriggen and Vallerga 2022). More specifically, there have been several qualitative studies of news coverage focusing on the clothing and appearance of women candidates, often using Hillary Clinton as a case study (Cummings and Terrion 2021; McGinley 2009; Miller, Peake, and Boulton 2010). However, to our knowledge there has not been a widespread quantitative analysis of either news coverage related to women candidates and their appearance or how online discourse about candidates differs by gender.
This raises the question: do either political elites or the mass public focus more on the clothing and appearance of women candidates? Furthermore, what are the potential consequences of this discourse for support for women candidates or for women’s desire to seek political office?
We have conducted a preliminary analysis of over 25,000 tweets relating to the 2020 Democratic primaries. We find that the tone of tweets mentioning women candidates is overall more negative, though at this point we cannot yet determine if this is true of tweets specifically mentioning fashion, clothing, and appearance terms. At this stage of our analysis, we have used Hugging Face AI to create a model to classify tweets as either containing clothing terms or not. We have recently received a grant to conduct this analysis on a larger scale using the “openai” R package, which connects to ChatGPT’s API (Rudnytskyi 2023), and we plan to apply a similar model to all 25,000 tweets in our dataset in conjunction with handcoding efforts. This approach resembles handcoding by research assistants, but recent work finds that it outperforms both research assistants and Amazon MTurk workers in accuracy (Gilardi, Alizadeh, and Kubli 2023), and it can be performed more quickly and on a larger scale. It is our hope that, in addition to providing insight into dialogue on appearance and women candidates, we can also advance the use of artificial intelligence language models in quantitative text analysis.
Women Candidates and News Coverage
Previous research shows that voters are likely to engage in various forms of gender stereotyping when female candidates are on the ballot. For example, voters perceive women candidates to be less qualified than male candidates with similar qualifications (Paul and Smith 2008) while simultaneously holding women candidates to much higher qualification standards than their male counterparts (Bauer 2019). Regarding the role of the news media in gendered candidate evaluations, races with at least one woman running have been found to receive more “candidate trait” media coverage than “candidate issue” coverage (Dunaway et al. 2013). Coverage of candidates in the 2008 presidential election showed that Hillary Clinton and Sarah Palin were both referred to by gender labels more often than their male counterparts (Meeks 2013), and that Clinton was referred to more often by her first name than Barack Obama, especially by male news anchors (Uscinski and Goren 2011).
Turning to the presumed focus on the clothing and appearance of women candidates, qualitative studies based on historical evidence and interviews with candidates show that journalists focus more on the appearance of women candidates than on that of their male counterparts. This includes a focus on clothing as well as hairstyle and physical demeanor, such as whether a candidate seems tired. Some women politicians have even said in interviews that they strategically select clothing in order to “blend in” (Braden 1996). More generally, coverage of women candidates may focus on whether a candidate appears “unladylike” (Braden 1996) or “too masculine” (Cummings and Terrion 2021; Messner 2007). As McGinley (2009) notes, women candidates may experience difficulties “performing gender” in a manner that is acceptable to either voters or the news media. She uses the example of Hillary Clinton in 2008 being criticized for seeming too masculine and not traditionally attractive, while Sarah Palin was seen as more traditionally feminine but criticized for seeming vain and for supposedly exorbitant clothing expenses. Notably, the challenges of “performing gender” are not confined to women: Messner (2007) notes that men running for office feel pressure to appear sufficiently masculine to voters.
Overall, women candidates may have been treated differently in past election cycles due to the “novelty” aspect of women running for office (Duerst-Lahti 2006; Holt 2012). However, it is possible that as more women run for office, especially looking at the crowded 2020 Democratic presidential primary field (Hora 2019), journalists have become less preoccupied with the “husband, hair, hemline” trifecta. Notably, the extent to which discourse surrounding women candidates has changed is difficult to determine in the absence of widespread quantitative analyses on the subject.
Social Media and Political Discourse
As Wallsten (2014) notes, Twitter users are not representative of the American public at large. As of 2019, Twitter users were on average younger, more liberal, and more educated than the average American (Wojcik and Hughes 2019)[1]. Additionally, most users on Twitter are not content creators, and many users who do post are “retweeting” rather than producing original content (Yaqub et al. 2017). However, O’Connor et al. (2010) argue that in some cases Twitter sentiment is reflective of public opinion polls, using President Obama’s polling numbers as an example. Additionally, journalists have increasingly used Twitter to find and connect with potential sources (Broersma and Graham 2012) and to monitor public discussion on the platform for news stories (Wallsten 2014), demonstrating the potential for social media platforms such as Twitter to influence discourse on mainstream news outlets. Furthermore, social media platforms such as Twitter have given average users the opportunity to become “agenda-setters,” though topic analysis shows that for many political topics it is political elites who have the most influence in agenda-setting (Xu et al. 2013).
While Twitter is not representative of the mass public, the influence of “content creators” and agenda-setters on these platforms arguably has real-world political impacts. For example, Wallsten (2014) notes that politicians can interact with activists and big donors on Twitter. These individuals, while not necessarily high-profile political elites or household names, still wield influence in presidential primary elections. As Masket (2020) notes, many Democratic Party activists supported Joe Biden early in the 2020 Democratic primary, even as Biden struggled in early states like Iowa and New Hampshire. This early support in the “invisible primary” phase of a presidential campaign is crucial to primary candidate survival (Arthur 1976). Notably, the invisible primary is a period in which political elites “vet” candidates for office to ensure that these candidates both reflect party values and have the ability to win a general election (Grossmann and Hopkins 2016). Based on interviews with Democratic Party activists, Masket (2020) finds that they were particularly concerned about nominating a candidate who could defeat Trump in a general election, and that some of them worried voters would not support another woman on the general election ballot. Additionally, survey evidence shows that voters were less supportive of women candidates, particularly Elizabeth Warren and Kamala Harris, when primed to consider electability. This evidence suggests that ahead of the 2020 Democratic primaries both voters and political elites were concerned about the potential consequences of nominating a female candidate. Presumably, some of these political elites may have been leading social media discourse during key 2020 Democratic primary moments.
However, it is unclear if other considerations, like priming voters to consider clothing and appearance, would have affected voters’ support for women candidates in the 2020 primaries. Possibly, discussing candidate clothing and appearance would qualify as what Suhay, Grofman, and Trechsel (2020) refer to as indirectly “complimenting or criticizing a candidate” (4), thereby delegitimizing these candidates in voters’ minds. It is also possible that mentioning candidate clothing and appearance inherently primes voters to consider a candidate’s gender, which in turn leads voters to question a candidate’s “electability” based on gender.
Prior to 2016 there was little opportunity to analyze how social media discourse about candidates differs by gender. Though analysis of traditional media shows that news anchors referred to Hillary Clinton, as a presidential candidate, by her first name more often (Uscinski and Goren 2011) and were more likely to use gender labels to describe her (Meeks 2013), we do not know whether this was also true of social media content creators. In one of the few large-scale studies involving sentiment analysis of social media content about a woman candidate, Yaqub et al. (2017) find that tweets mentioning Clinton were more negative than tweets mentioning Trump. However, this may have reflected the candidates’ own tweets, as Clinton’s tweets were overall more negative in tone than Trump’s. Therefore, it is difficult to determine if tweets referencing Clinton were more negative in tone due to her gender or other factors.
[1] Dinesh and Odabas (2023) note that since Elon Musk purchased the company and rebranded it as “X,” 60 percent of users indicated they had “taken a break” from the platform and many highly active users are also using the platform less frequently.
Data and Methods
Tweets for this analysis were collected from the “#Election2020” dataset, which includes over 600 million “U.S. politics- and election-related tweets” from December 2019 to June 2021 (Chen et al. 2020). For purposes of this study, we have only included tweets from the dates of Democratic presidential primary debates: December 19, 2019; January 14, 2020; February 7, 2020; February 19, 2020; February 25, 2020; and March 15, 2020. Retweets and duplicate tweets were removed from our dataset. Next, we extracted tweets that mention any of the candidates by their Twitter handle. Approximately 1.3 million tweets were included in our initial dataset.
We ran Bing sentiment analysis on our initial dataset of 1.3 million tweets. The Bing lexicon contains 6789 words, each scored as either positive 1 or negative 1. Within this lexicon 2006 words are considered positive and 4783 are considered negative (Naldi 2019). The score of each tweet is the sum of the scores of the words it contains. For example, in the tweets below, the tweet mentioning Elizabeth Warren has a score of four, while the tweet mentioning Bernie Sanders has a score of -1. First, we examine whether tweets mentioning women candidates are more negative overall.
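The scoring rule described above can be illustrated with a minimal Python sketch. The tiny lexicon and the example sentences here are invented for illustration and are not drawn from the Bing word list or from our tweet data; the tokenization rule is likewise an assumption.

```python
import re

# Hypothetical stand-in for the Bing lexicon: a handful of illustrative
# entries, NOT the real word list.
TOY_LEXICON = {
    "love": 1, "great": 1, "smart": 1,
    "bad": -1, "corrupt": -1, "lose": -1,
}

def bing_score(text, lexicon=TOY_LEXICON):
    """Sum the +1/-1 scores of every lexicon word found in the text.

    Words that are not in the lexicon contribute nothing to the score.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)

# Invented example sentences (not tweets from the dataset).
print(bing_score("I love her plan, it is smart"))
print(bing_score("what a bad idea"))
```

Because the score is a simple sum, a tweet with one positive and one negative word nets out to zero, which is one reason lexicon scores are best read in aggregate rather than tweet by tweet.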
Table 1 displays results of a regression in which the outcome variable is each tweet’s Bing score and the explanatory variables are candidate gender, ideological extremity of the candidate mentioned, and the volume of news coverage the candidate received[1]. The control variables account for the possibility that women candidates like Elizabeth Warren are more ideologically liberal, which may drive the tone of social media discussions, or that candidates who receive more news coverage receive more negative tweets. Results in Table 1 show that the coefficients on all three variables are negative and statistically significant. Therefore, tweets mentioning women candidates are on average more negative in tone, even when controlling for ideological extremity and volume of candidate news coverage.
[1] Ideology scores are taken from the Database on Ideology, Money in Politics, and Elections, in which candidate ideology is estimated using candidate donor networks (Bonica 2015). News coverage is measured as the total amount of coverage each candidate received in 2019 on cable news networks.
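For concreteness, the regression reported in Table 1 can be written out as follows. This assumes a linear specification estimated at the tweet level, which is an illustrative assumption rather than a statement of the exact estimator used; the variable names are labels chosen here for readability.

```latex
\text{Bing}_i = \beta_0
  + \beta_1\,\text{Woman}_i
  + \beta_2\,\text{Extremity}_i
  + \beta_3\,\text{Coverage}_i
  + \varepsilon_i
```

Here $i$ indexes tweets, $\text{Woman}_i$ equals 1 when the candidate mentioned in tweet $i$ is a woman, $\text{Extremity}_i$ is that candidate's ideological extremity, and $\text{Coverage}_i$ is that candidate's volume of cable news coverage. Under this reading, a negative $\beta_1$ corresponds to lower average Bing scores for tweets mentioning women candidates, holding the other two variables fixed.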
The terms in Table 2 have been identified using word embeddings, which can identify words with similar meanings, like “writer” and “author,” even if they do not share a common lemma or stem (Grimmer, Roberts, and Stewart 2021). Rather, word embeddings extract meaning from words based on the contexts in which they appear. To identify the words used, we started with two terms of interest, “clothes” and “hair,” and used a pre-trained embedding to identify the most similar words that were still directly related to clothing and appearance. Extracting tweets containing these terms created a corpus of over 25,000 tweets. To see if the sentiment results hold when isolating tweets related to clothing and appearance, we run a similar regression on our new corpus. Results show that while all three variables have negative coefficients, only ideological extremity and candidate coverage are statistically significant.
Terms
Attire
Blazer
Boot
Clothing
Dress
Hair
Heel
Jacket
Lipstick
Makeup
Outfit
Pant
Shirt
Shoe
Skirt
Suit
Sweater
Wardrobe
Wear
Table 2: Search Terms for clothing and appearance
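The embedding lookup used to arrive at terms like those in Table 2 can be illustrated with a small sketch. It ranks candidate words by cosine similarity to a seed term. The three-dimensional vectors below are made up for illustration; an actual pre-trained embedding has hundreds of dimensions, and the function names are hypothetical.

```python
import math

# Hypothetical toy vectors, invented for illustration only.
TOY_EMBEDDINGS = {
    "clothes": [0.9, 0.1, 0.0],
    "outfit":  [0.8, 0.2, 0.1],
    "jacket":  [0.7, 0.3, 0.0],
    "hair":    [0.1, 0.9, 0.1],
    "makeup":  [0.2, 0.8, 0.2],
    "policy":  [0.0, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(seed, embeddings, top_n=2):
    """Return the top_n words closest to the seed word, most similar first."""
    seed_vec = embeddings[seed]
    scored = [
        (word, cosine(seed_vec, vec))
        for word, vec in embeddings.items()
        if word != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_n]]

print(most_similar("clothes", TOY_EMBEDDINGS))
print(most_similar("hair", TOY_EMBEDDINGS))
```

In practice the ranked neighbors still need a manual pass, since nearby words in embedding space are not always about clothing or appearance.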
Next, we cross-check tweets in our clothing/appearance corpus to verify that each tweet describes what a candidate is wearing or looks like. In the examples highlighted, while the tweet mentioning Elizabeth Warren clearly refers to the shoes she wears, the tweet mentioning Sanders refers to his campaign t-shirts, not to what Sanders himself looks like or is wearing. Additionally, the tweet mentioning Biden uses the phrase “dressing up a turd,” an idiom for trying to disguise something unpleasant rather than a description of actual clothing. Such discrepancies underscore the importance of validating dictionary-based approaches to document classification.
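A dictionary filter of the kind discussed above can be sketched as follows, using the terms from Table 2. The suffix rule (allowing endings such as "s" or "ing") and the example sentences are assumptions made for illustration and do not reproduce the exact matching procedure or any tweet from the dataset.

```python
import re

# Search terms from Table 2, lowercased.
TERMS = [
    "attire", "blazer", "boot", "clothing", "dress", "hair", "heel",
    "jacket", "lipstick", "makeup", "outfit", "pant", "shirt", "shoe",
    "skirt", "suit", "sweater", "wardrobe", "wear",
]

# Match each term at a word boundary, allowing a few simple suffixes so
# that "shoes" or "dressing" still count. This rule is an assumption.
PATTERN = re.compile(
    r"\b(" + "|".join(TERMS) + r")(?:s|es|ed|ing)?\b",
    re.IGNORECASE,
)

def matched_terms(text):
    """Return the sorted set of Table 2 terms that appear in the text."""
    return sorted({m.group(1).lower() for m in PATTERN.finditer(text)})

# Invented sentences that mimic the three kinds of cases in the text.
print(matched_terms("she always wears sensible shoes on the trail"))
print(matched_terms("just ordered my campaign t-shirt"))
print(matched_terms("that ad is just dressing up a bad record"))
```

All three sentences are flagged, yet only the first describes what a candidate is wearing. A pure term match cannot tell these cases apart, which is why a context-aware second pass is needed.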
In our preliminary analysis, we have tested an AI-based classification method of this kind using the free Hugging Face Chat API and the “hugchat” Python package.
We used the following query to classify tweets:
"Assess text for clothing references: 1) Identify any clothing terms (e.g., 'shirt', 'dress'). 2) Analyze context: Are these terms describing clothing worn by the individual referenced, used metaphorically, or in another context? 3) Conclude if text contains: a) Direct clothing references, b) Clothing terms in different contexts, or c) No clothing-related terms. Return Judgement: [a,b, or c], Summary: summarize your thoughts."
Hugging Face Text Classification Examples Using “hugchat” Package:
I. Tweet: @tulsigabbard *shirt sales explode* Tulsi Gabbard
Judgement: [a] Direct clothing references
Summary: The text contains a direct reference to clothing, specifically "shirt sales exploding." This suggests that the topic of clothing is being discussed in relation to Tulsi Gabbard. Therefore, the judgement is [a] Direct clothing references.
II. Tweet: @four_t_five @cindyseestruth @joebiden trying to look like "joe hollywood" with the dark glasses is a joke. settle into the rocking chair on the back porch facing the junk yard. Joe Biden
Judgement: [b] Clothing terms in different contexts
Summary: The text contains a reference to "dark glasses" which could be interpreted as a clothing item, but it is not clear that it is being used to describe something worn by Joe Biden. Instead, it appears to be a metaphorical reference to the idea of "looking like a Hollywood star." Therefore, this text contains clothing terms in a different context.
III. Tweet: it was so much fun to see @petebuttigieg on @theellenshow! ♥♥ and loved the questions she asked that drew in @chas10buttigieg. so much ♥ on ellen. Pete Buttigieg
Judgement: [c] No clothing-related terms
Summary: The text does not contain any clothing terms or references to clothing. It is focused on Pete Buttigieg's appearance on The Ellen Show and the positive experience of seeing him there. There are no direct or metaphorical references to clothing in the text. Therefore, the text contains no clothing-related terms.
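The workflow behind these examples can be sketched in Python. This is a minimal illustration, not the code used in the analysis: the way the tweet is appended to the query, the function names, and the `send` callable (a stand-in for whatever call submits a prompt to the chat model and returns its reply) are all assumptions. A stub is used in place of a live model so the sketch runs on its own.

```python
import re
from collections import Counter

# The classification query quoted above, reproduced verbatim.
QUERY = (
    "Assess text for clothing references: 1) Identify any clothing terms "
    "(e.g., 'shirt', 'dress'). 2) Analyze context: Are these terms describing "
    "clothing worn by the individual referenced, used metaphorically, or in "
    "another context? 3) Conclude if text contains: a) Direct clothing "
    "references, b) Clothing terms in different contexts, or c) No "
    "clothing-related terms. Return Judgement: [a,b, or c], Summary: "
    "summarize your thoughts."
)

# Pull the single-letter label out of a reply such as "Judgement: [a] ...".
JUDGEMENT_RE = re.compile(r"Judgement:\s*\[?\s*([abc])\b", re.IGNORECASE)

def build_prompt(tweet):
    """Attach one tweet to the query (the layout is an assumption)."""
    return f"{QUERY}\n\nText: {tweet}"

def parse_judgement(response):
    """Return 'a', 'b', or 'c', or None if no label can be found."""
    match = JUDGEMENT_RE.search(response)
    return match.group(1).lower() if match else None

def classify(tweets, send):
    """Classify each tweet; `send` maps a prompt string to a reply string."""
    return [parse_judgement(send(build_prompt(tweet))) for tweet in tweets]

def fake_send(prompt):
    """Hypothetical stub standing in for the chat model; always answers c."""
    return "Judgement: [c] No clothing-related terms\nSummary: stub reply."

labels = classify(["example tweet one", "example tweet two"], fake_send)
print(Counter(labels))
```

Returning None for replies that do not follow the requested format makes it easy to count and re-run malformed responses, or to route them to handcoding, instead of silently assigning them a category.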
Discussion
As Gilardi et al. (2023) note, tools like ChatGPT demonstrate the ability of researchers to incorporate AI language models into quantitative research of text data. Regardless of whether we find that social media commentary on clothing and appearance is disproportionately targeted at women candidates, we are optimistic this project will further studies on using AI language models in text classification.
Furthermore, potential discrepancies in clothing/appearance dialogue may affect voters’ perceptions of candidates, including whether voters believe that candidates whose coverage focuses disproportionately on appearance are well-suited for office. Additionally, such dialogue may affect support for women candidates via framing effects. Framing refers to individuals forming an opinion about a topic based on how it is presented. Framing does not necessarily provide individuals with new information, but may instead “guide” consumers of media content to think about what they already know in a way that emphasizes certain considerations over others (Chong and Druckman 2007). Therefore, frequent explicit mentions of clothing and appearance may lead voters to fixate on these factors as well.
Future Plans
Eventually, our goal is to conduct a similar analysis using cable news transcripts rather than election-related tweets. As Suhay, Grofman, and Trechsel (2020) note, news coverage affects voter support of candidates, even if unintentionally, when journalists covertly compliment or criticize a candidate. Therefore, if we find that cable news commentators are priming viewers to consider women candidates’ fashion choices or other factors related to their appearance, this may delegitimize these candidates in the minds of these viewers. Additionally, focusing on what women candidates are wearing may prime viewers to consider candidate gender, while focusing on what male candidates wear might have little to no effect. There is evidence that priming voters to think of certain considerations, like “electability,” results in voters being less supportive of women candidates (Bateson 2020; Masket 2020). Therefore, it is possible that drawing attention to clothing and appearance could make voters feel that women candidates are less competent or qualified for office. Further research could use a survey experiment with vignettes to determine how clothing and appearance coverage affects perceptions of candidates.
We expect that conducting a similar analysis via transcript data may pose some difficulties, given that transcripts are inherently longer text documents than tweets. However, Gilardi et al. (2023) were able to extend their analysis to news articles as well as tweets and found ChatGPT to be similarly effective in some circumstances. If implemented correctly, we expect that extending this analysis to transcripts from hour-long news programs will provide additional evidence that researchers can use AI language models effectively in text classification more quickly and at a lower cost than more traditional methods.