Proceedings

Papers (🚧 we are planning to upload the papers as soon as possible)

Automatic Emotion Experiencer Recognition (Maximilian Wegge) [Paper]

The most prominent subtask in emotion analysis is emotion classification: assigning a category to a textual unit, for instance a social media post. Many research questions from the social sciences, however, require not only detecting the emotion of a post's author but also understanding who is ascribed an emotion in text. This task is tackled by emotion role labeling, which aims to extract who is described in text as experiencing an emotion, why, and towards whom. This could, however, be considered overly sophisticated if the main question to answer is who feels which emotion. A more targeted approach for such a setup is to classify emotion experiencer mentions (so-called "emoters") regarding the emotion they presumably perceive. This task is similar to named entity recognition of person names, with the difference that not every mentioned entity is an emoter. While, very recently, data with emoter annotations has been made available, no experiments have yet been performed to detect such mentions. With this paper, we provide baseline experiments to understand how challenging the task is. We further evaluate, in a pipeline setting, the impact on experiencer-specific emotion categorization and appraisal detection when gold mentions are not available. We show that experiencer detection in text is a challenging task, with a precision of .82 and a recall of .56 (F1 = .66). These results motivate future work on jointly modeling emoter spans and emotion/appraisal predictions.

Personalized Intended and Perceived Sarcasm Detection on Twitter (Joan Piepi) [Paper]

Sarcasm detection is a challenging task for various NLP applications. It often requires additional context related to the conversation or the participants involved to interpret the intended meaning. In this work, we introduce an extended reactive supervision method to collect sarcastic data from Twitter and improve the quality of the extracted data. Our new dataset contains around 35K tweets labeled as sarcastic or non-sarcastic, as well as additional tweets providing both conversational and author context. The experiments focus on two binary classification tasks: sarcastic vs. non-sarcastic and intended vs. perceived sarcasm. We compare models that use textual features of the tweets with models that additionally use author embeddings derived from their historical tweets. Moreover, we show the importance of combining conversational features with author features.

SpeakGer: A meta-data enriched speech corpus of German state and federal parliaments (Kai-Robin Lange) [Paper]

The application of natural language processing to political texts and speeches has become increasingly relevant in political science, as it makes it possible to analyze large text corpora that cannot be read by a single person. Such corpora, however, often lack critical meta-information, for instance the party, age or constituency of the speaker, that can be used to tailor an analysis to more fine-grained research questions. To enable researchers to answer such questions with quantitative approaches such as natural language processing, we provide the SpeakGer data set, consisting of German parliamentary debates from all 16 federal states of Germany as well as the German Bundestag from 1947 to 2023, split into a total of 10,806,105 speeches. The data set includes rich metadata in the form of information on audience reactions to the speech as well as the speaker's party, age, constituency and the party's political alignment, which enables deeper analyses. We further provide three exploratory analyses: topic shares of different parties over time, a descriptive analysis of the development of the average speaker's age, and a sentiment analysis of different parties' speeches with regard to the COVID-19 pandemic.

Multilabel Legal Element Classification on German Parliamentary Debates in a Low-Resource Setting (Martin Hock) [Paper]

Parliamentary debates provide a broad overview of (legal) pieces of evidence for supporting or opposing the use of force by a state. If a state backs its practice by referring to a legal concept or the legal elements of that concept, the existence of a rule of customary international law (CIL) may be assumed. Traditionally, however, parliamentary debates have rarely been used as a source of CIL. We address this research gap with a joint approach that combines methods from political science, legal studies and natural language processing in order to ascertain the existence of CIL regarding the legal concepts of humanitarian intervention and the responsibility to protect. We introduce a new framework and dataset, LegalECGPD, to tackle the task of automatic legal element classification for analysing the use of force in German parliamentary debates. We performed multiple experiments in low-resource settings, showing the need for in-domain expertise and the limitations of supervised approaches when faced with tasks that require the interpretation of rich contextual information. Our resources are available under an open-source license for further research.

Bubble up – A Fine-tuning Approach for Style Transfer to Community-specific Subreddit Language (Alessandra Zarcone) [Paper]

Different online communities (social media bubbles) can be identified by their use of language. We looked at different social media bubbles and explored the task of translating the language of one bubble into that of another while maintaining the intended meaning. We collected a dataset of Reddit comments from 20 different subreddits, and for a smaller subset of them we obtained style-neutral versions generated by a large language model. We then used the dataset to fine-tune different (smaller) language models to learn style transfer between social media bubbles. We evaluated the models on data from four unseen social media bubbles to assess to what extent they had learned the style transfer task, and compared their performance with the zero-shot performance of a larger, non-fine-tuned language model. We show that with a small amount of fine-tuning the smaller models achieve satisfactory performance, making them more attractive than a larger, more resource-intensive model.
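As a rough illustration of the kind of fine-tuning setup described above, the following sketch trains a small sequence-to-sequence model on (style-neutral, subreddit-styled) pairs with Hugging Face Transformers; the model name, prompt format, example pair and hyperparameters are assumptions made for the example, not details taken from the paper.

```python
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

# Hypothetical sketch, not the authors' code: fine-tune a small seq2seq model
# to rewrite a style-neutral comment in the style of a target subreddit.
MODEL_NAME = "google/flan-t5-small"   # assumed model choice
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Invented example pair: (target subreddit, style-neutral text, styled comment)
pairs = [
    ("wallstreetbets", "The stock price fell sharply today.", "puts printing, we eating good tonight"),
]
ds = Dataset.from_dict({
    "input": [f"rewrite for r/{sub}: {neutral}" for sub, neutral, _ in pairs],
    "target": [styled for _, _, styled in pairs],
})

def tokenize(batch):
    enc = tokenizer(batch["input"], truncation=True)
    enc["labels"] = tokenizer(text_target=batch["target"], truncation=True)["input_ids"]
    return enc

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="bubble-up-demo", num_train_epochs=3),
    train_dataset=ds.map(tokenize, batched=True, remove_columns=ds.column_names),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
# trainer.train()
```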

According to BERTopic, what do Danish Parties Debate on when they Address Energy and Environment? (Costanza Navarretta) [Paper]

This paper investigates how two policy areas, Environment and Energy, were dealt with by seven Danish left- and right-wing parties in their electoral manifestos (2007-2019) and parliamentary debates between 2009 and 2020. The main aim is to determine whether the topics discussed by the parties in the debates are the same as those addressed in the electoral manifestos, and whether the parties give the same weight to the two policy areas in the manifestos and debates. We determine how often and for how long the parties address the two policy areas in the two datasets, and we compare the topics addressed in the electoral manifestos with those generated by a topic modeling system, BERTopic. Both a multilingual and a Danish BERT model are tested. In our comparison, we take into account the relation between issue and party competition, the parties' profiles and whether they are part of the government or the opposition, as proposed by Danish political scientists. Our comparison shows that only a few parties behave consistently in Parliament and in their electoral manifestos with respect to the topics that they address.

The UNSC-Graph: An Extensible Knowledge Graph for the UNSC Corpus (Stian Rødven-Eide) [Paper]

We introduce the UNSC-Graph, a knowledge graph for a corpus of debates of the United Nations Security Council (UNSC) during the period 1995-2020. The graph combines previously disconnected data sources, including the UNSC Repertoire, the UN Library, Wikidata, and metadata extracted from the speeches themselves. Beyond existing metadata detailing debates' topics and participants, we also extended the graph to include all country mentions in a speech, the geographical neighbours of mentioned countries, as well as sentiment scores. By linking the graph to Wikidata, we are able to include additional geopolitical information and extract various country name aliases to extend the coverage of country mentions beyond existing NER-based approaches. Studying mentions of Ukraine after 2014, we present a use case for the graph as a source for continuous analysis of international politics and geopolitical events discussed in the UNSC.
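To make the Wikidata linking step more tangible, below is a minimal, hypothetical sketch (not the authors' pipeline) of how country aliases could be pulled from the public Wikidata SPARQL endpoint with Python; the query and the class identifier are standard Wikidata conventions, not details from the paper.

```python
import requests

# Minimal sketch: retrieve English aliases for all sovereign states from
# Wikidata, which could then be matched against speech text to find country
# mentions beyond an NER system's coverage.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?country ?countryLabel ?alias WHERE {
  ?country wdt:P31 wd:Q3624078 .   # instance of: sovereign state
  OPTIONAL { ?country skos:altLabel ?alias . FILTER (lang(?alias) = "en") }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""

def fetch_country_aliases():
    resp = requests.get(
        SPARQL_ENDPOINT,
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "unsc-graph-demo/0.1"},
    )
    resp.raise_for_status()
    aliases = {}
    for row in resp.json()["results"]["bindings"]:
        name = row["countryLabel"]["value"]
        aliases.setdefault(name, set()).add(name)
        if "alias" in row:
            aliases[name].add(row["alias"]["value"])
    return aliases

if __name__ == "__main__":
    countries = fetch_country_aliases()
    print(len(countries), "countries,", sum(len(v) for v in countries.values()), "surface forms")
```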

Deep Dive into the Language of International Relations: NLP-based Analysis of UNESCO's Summary Records (Emilia Wiśnios) [Paper]

Cultural heritage is an arena of international relations that interests all states worldwide. The inscription process for the UNESCO World Heritage List and the UNESCO Representative List of the Intangible Cultural Heritage of Humanity often leads to tensions and conflicts among states. This research addresses these challenges by developing automatic tools that provide valuable insights into the decision-making processes regarding inscriptions to the lists mentioned above. We propose innovative topic modelling and tension detection methods based on UNESCO's summary records. Through our analysis, we achieved a commendable accuracy rate of 72% in identifying tensions. Furthermore, we have developed an application tailored for diplomats, political scientists, and international relations researchers that facilitates the efficient search of paragraphs from selected documents and statements from specific speakers about chosen topics. This application is a valuable resource for enhancing the understanding of complex decision-making dynamics within international heritage inscription procedures.

Evaluating the Quality of the GermaParl Corpus of Plenary Protocols (v2.0.0) (Christoph Leonhardt) [Paper]

Parliamentary debates play a key role in the democratic process and in law-making. Scholarly interest in this material benefits greatly from the emergence of new datasets and corpora of parliamentary protocols. Here we combine the presentation of a second, extended version of GermaParl with an evaluation of the data quality of this corpus of plenary protocols of the German Bundestag. For this purpose, about 1 per cent of all protocols have been annotated manually to create a gold standard against which the structurally annotated corpus is compared. Results indicate that GermaParl can be considered a trustworthy resource for a broad set of research questions.

Abstracts

Do Arguments Migrate? Using NLP for Understanding Academic Debates (Jürgen Neyer)

A crucial question for academia is the relevance of arguments for scientific progress. Are participants in academic debates open to the arguments and insights of other authors, even if they are embedded in competing research paradigms? Or is discursive openness limited to intra-paradigmatic debates? What are the conditions under which arguments migrate within and across paradigms?

Arguments are central to social science. Arguments are used to make sense of complex data, challenge assumptions, and develop theories. They are often specific to certain theories and help distinguish between competing theories. But while arguments are often assumed to be theory-specific, an open question in science is under what conditions arguments migrate within and across paradigms. The paper presents the research design and first findings from a four-year research project to build a social science artificial intelligence (AI) lab for research-based teaching (SKILL). Relying on computational linguistic and visual analysis of the corpus based on machine learning (ML) and natural language processing (NLP), the project aims to demonstrate the importance of arguments and how they are used in scholarly debates from the field of International Relations (IR) and in political debates in the global realm. The project sets up the most extensive annotated text corpus available for international relations and trains an algorithm to recognise and qualify arguments according to their theoretical origin, supporting evidence and argumentative structure. It relies on a specially designed category system for the domain-level annotation and a simplified version of Toulmin’s argumentation model for the argumentation-level annotation.

Which is Bigger: Switzerland or Chad? Modeling Size Variation in Country Embeddings (Sara Bartl)

Countries can be ‘big’ or ‘small’ in various ways. We can think of a country’s literal size in terms of population or surface area. But countries can also be ‘big’ in a more figurative way. For example, a country can be ‘big’ in terms of economic or political power. One way to examine how words from a certain category vary conceptually with regard to a particular attribute is through word embeddings. For example, we can analyse how different countries are conceptualised in terms of their relative size by examining their position in a word embedding space in relation to ‘size’ words. Examining word embedding models from this perspective is important because it allows us to understand both how these models represent concepts and how these concepts are discussed in the real world.

In this study, we examine how various external factors predict a country’s relative ‘size’ in the GloVe model (Pennington et al., 2014). First, we use semantic projection to determine the relative ‘size’ of a country as represented in the model. We construct a semantic scale representing the feature of ‘size’ following Grand et al. (2022). This dimension is constructed by fitting a line through the positions of various ‘size’ words in the embedding space, with ‘big’, ‘large’, ‘huge’ at one end of the scale and ‘small’, ‘little’, ‘tiny’ at the other. We then obtain a ‘size’ score for each country in our dataset by orthogonally projecting its embedding onto this scale. The closer a country is to the ‘small’ end of the scale, the smaller its relative size according to the model; conversely, the closer it is to the ‘big’ end, the bigger its relative size according to the model.
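As a rough illustration of the projection step described above, the following minimal numpy sketch (not the authors' code) assumes GloVe vectors have already been loaded into a Python dict and approximates the ‘size’ scale as the difference between the mean ‘big’ and mean ‘small’ vectors, in the spirit of Grand et al. (2022).

```python
import numpy as np

# Illustrative sketch of semantic projection onto a 'size' scale, assuming
# `glove` maps lower-cased words to numpy vectors (e.g. loaded from
# glove.6B.300d.txt).
BIG_WORDS = ["big", "large", "huge"]
SMALL_WORDS = ["small", "little", "tiny"]

def size_direction(glove):
    big = np.mean([glove[w] for w in BIG_WORDS], axis=0)
    small = np.mean([glove[w] for w in SMALL_WORDS], axis=0)
    return big - small  # vector pointing from the 'small' to the 'big' pole

def size_score(word, glove, direction):
    # Orthogonal projection of the word vector onto the size scale,
    # normalised by the length of the scale vector.
    return float(np.dot(glove[word], direction) / np.linalg.norm(direction))

# Usage (hypothetical): compare two countries' relative 'size' in the model.
# d = size_direction(glove)
# print(size_score("switzerland", glove, d), size_score("chad", glove, d))
```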

Next, we examine the relationship between this semantic projection and three external factors related to the size of each of the countries in our dataset: population, geographical area and GDP. We find that all three factors show substantial correlations with the semantic projection, as well as with each other, with GDP surprisingly being an especially strong predictor. A regression model with these three predictors accounts for a moderate amount of variation in the semantic projection (R² = 0.3568).

Although the regression model accounts for a substantial amount of variance in the word-embedding country scores, there is also a considerable amount of variance left unexplained. 

We therefore further analyze the residuals of the regression model to identify other factors that may affect this conceptualisation, including possibly what can be interpreted as hidden sources of bias in the embedding space and, by extension, in the underlying training data.

We find that the macro-region in which a country is located is also an important predictor of a country’s relative ‘size’ in the embedding space (ADD NEW R-SQUARED). Specifically, European countries seem to be comparatively smaller in the GloVe embedding space, whereas African countries appear to be bigger in the embeddings than predicted by the other factors.
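A compact sketch of how such a regression and residual analysis might be run with statsmodels (illustrative only; the column names, log transforms and region coding are assumptions, not details from the abstract).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# `df` is assumed to hold one row per country with the semantic projection
# score and the external predictors: size_score, population, area, gdp, macro_region.
def fit_models(df: pd.DataFrame):
    # Baseline model with the three literal-size predictors
    # (log-transforming the heavily skewed predictors is an assumption).
    base = smf.ols(
        "size_score ~ np.log(population) + np.log(area) + np.log(gdp)", data=df
    ).fit()

    # Regress the residuals on macro-region to check whether region explains
    # variation left over after the literal-size factors are accounted for.
    region = smf.ols(
        "resid ~ C(macro_region)", data=df.assign(resid=base.resid)
    ).fit()
    return base, region

# base, region = fit_models(countries_df)
# print(base.rsquared, region.rsquared)
```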

This study contributes to a growing body of literature that seeks to use meaning dimensions to study meaning representation in word embeddings and language models (e.g. Engler et al. 2022). Rather than seeking to identify bias in the language model, we seek to discover possible sources of literal and connotational meaning and holistically model their influence on variation in the embedding space. 

What Can Go Wrong in Authorship Profiling for Demographic Prediction: A Systematic Error Analysis of Model Exclusion (Hongyu Chen)

Authorship Profiling (AP) has become a prominent task in recent years, aiming to identify an author's demographic characteristics through their writing style. While text categorization using stylometric features has shown high accuracy in predicting demographic attributes (e.g., gender and age) in certain tasks, there are still cases where ML models and state-of-the-art models do not perform well. This brings the potential risk of marginalizing and misrepresenting certain demographic groups, ultimately giving rise to biases and discrimination. Thus, to gain a comprehensive understanding of the extent to which sub-optimal models might exclude authors in certain demographic groups, this paper aims to shed light on the models' exclusion behavior in the AP task through a systematic error analysis.

Extremism as Semantic Frames: Combining Paradigmatic and Syntagmatic Distributional Approaches for Contrastive Modeling and Visualization of Complex Social Constructs (Tim Feldmüller)

The topic of extremism has repeatedly become the focus of media coverage, not only with regard to reporting on individual incidents of violence, but also in the context of new protest movements. In addition to the classic forms of right-wing and left-wing extremism as well as Islamism, possible new forms are also discussed. One example is the Querdenken movement that emerged in Germany during the Covid-19 pandemic. Yet extremism is by no means a narrow term that would allow for clear demarcations, but rather a very fluid concept that is constantly reshaped and has also been analyzed as "patchwork extremism" (Ackermann et al., 2015, p. 236). Further literature on German extremism discourse has shown that the framing of forms of extremism is also changeable in the process and that, for example, the potential for violence of right-wing extremism is only reflected in Spiegel Online coverage after the National Socialist Underground was uncovered (Czulo et al., 2020).

How the concept of extremism is constructed in the media is the subject of this still-ongoing research project. A large corpus (1.3 billion tokens) of articles from the newspapers Taz, Spiegel, and Welt covering the period 1999-2021 serves as the data basis.

The project develops a data-driven method that models conceptualizations of extremism as semantic frames. The frame model used by Busse (2012) represents a synthesis of various slot-filler models, including those of Fillmore, Minsky, Barsalou, and Schank & Abelson. Frames in this model exhibit slots - in the context of extremism, e.g. for actors (Bin Laden, Beate Zschäpe, ...), actions (committing attacks, demonstrating, ...) or attributive ascriptions (cruel, anti-constitutional, ...) - that are filled with specific values in concrete texts. Furthermore, frames are recursive, i.e. each slot and filler is itself a frame.

Not only because of the advantages in terms of impartiality and the potential to discover the unexpected in the data (Bubenhofer, 2009), but also because of the volatile conceptual character of extremism, a data-driven approach is particularly suitable. To this end, word embedding models (WEM) are first computed for the individual subcorpora (diachronic: three-year slices and one 1½-year slice; newspaper-contrastive: one subcorpus per newspaper) and then clustered using k-means. With a k-value of 2.4% of the vocabulary size, the clusters formed can be interpreted as frame elements of an extremism frame. Table 1 shows an example of three clusters identified for the period 2020-2021 that represent extremist actions, reactions to extremist attacks, and characteristics of right-wing extremism.

While the clusters formed in this way are based on paradigmatic distributional similarity, the interaction of the clusters can be captured via their syntagmatic relationship. For this purpose, collocations between the clusters were calculated using polmineR (Blätte & Leonhardt, 2020) and the Corpus Workbench (Evert & Hardie, 2011). Subsequently, the results of both methods can be visualized as a graph, with the clusters acting as nodes and the collocation values as edges.
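As a rough Python analogue of the pipeline just described (the project itself uses polmineR and the Corpus Workbench for the collocation step), the sketch below clusters the vocabulary of one subcorpus with k-means over word embeddings and then links clusters through a crude sentence-level co-occurrence count; the corpus handling, parameters and co-occurrence measure are assumptions, not the project's actual implementation.

```python
from itertools import combinations

import networkx as nx
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

# Illustrative only: cluster the vocabulary of one subcorpus into candidate
# frame elements. `sentences` is assumed to be a list of tokenised sentences
# from one time slice or one newspaper subcorpus.
def cluster_vocabulary(sentences, share=0.024):
    wem = Word2Vec(sentences, vector_size=300, window=5, min_count=10, workers=4)
    vocab = wem.wv.index_to_key
    k = max(2, int(share * len(vocab)))        # k = 2.4% of the vocabulary size
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(wem.wv[vocab])
    clusters = {}
    for word, label in zip(vocab, labels):
        clusters.setdefault(label, []).append(word)
    return clusters

# Crude stand-in for the collocation step: count sentence-level co-occurrences
# of words from different clusters and use the counts as edge weights of the graph.
def cluster_graph(sentences, clusters):
    word2cluster = {w: c for c, words in clusters.items() for w in words}
    g = nx.Graph()
    g.add_nodes_from(clusters)
    for sent in sentences:
        present = {word2cluster[w] for w in sent if w in word2cluster}
        for a, b in combinations(sorted(present), 2):
            if g.has_edge(a, b):
                g[a][b]["weight"] += 1
            else:
                g.add_edge(a, b, weight=1)
    return g
```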

The goal here is not to exclude hermeneutics from the methodology. On the contrary, the clusters and their relations to each other require interpretation. However, the formed categories (frame elements) emerge from the data with the help of established corpus-linguistic procedures and are interpreted only afterwards (cf. similarly Kupietz & Keibel, 2009; Vachková & Belica, 2009). Furthermore, the investigation of a very large amount of data and a wide range of (lexical) possibilities of verbalization is made possible.

As an example, Figure 1 shows the Hanau frame as part of a larger right-wing extremism frame for the years 2020-2021. The connected frame elements show how the events of the attack are constituted as a right-wing extremist terrorist violent event. The proposed method thus allows an explorative, also visual, access to the "new, large and complex units of meaning" (Tognini-Bonelli, 2001, p. 101) that result from the distributional interplay of words, and can show, via contrast, how frames and framings in the field of extremism change depending on time and media actor.

Text Scaling using Word Embeddings over Time and for Various Actors (Bastiaan Bruinsma)

Political scaling aims to order political actors on a political dimension according to their positions on it. Recently, one method to do so – the automated text analysis of political documents – has started to use word embeddings to improve its performance (Nanni et al. 2021; Rheault and Cochrane 2020; Rodman 2019). This has allowed scholars to move beyond several traditional assumptions – such as the bag-of-words assumption – and provide new perspectives. Here, we aim to build upon this promising development in three respects: time, documents, and applicability. For the first, we look at how and whether we can address the drawbacks that come with using embeddings over time – most often caused by semantic and topical drift. We investigate two methods – embedding all documents in the same space or in different slices – and compare our results against a set benchmark (Jolly et al. 2022). For the second, we look at how well these embeddings can scale actors other than politicians – such as parliamentary committees. Given the more technical nature of such committees, this allows us to test the embeddings in a more challenging setting. For the third, we test how useful the positions generated by the embeddings are in practice. For this, we use the electoral aid known as Voting Advice Applications (VAAs) as a use case. As VAAs aim to match political actors and voters by using their positions in a political space, we need valid and reliable positions. Also, as they are often constructed in haste before the elections based on limited knowledge of the parties’ positions, accurate positioning is often costly. As such, we aim to see how well this word embedding-based positioning performs on such limited data. In all, these three points can help us to better understand the promises and pitfalls of using word embeddings and give suggestions for further development.

Towards Causal Explanations for Text Classifiers - Applications in Political Science (Denitsa Saynova)

We have previously studied how explainability methods for neural text classifiers can be extended from the instance level to the class level, providing clues about the features that are descriptive of the class as a whole. Here we outline an extension of this method, to also bring causal reasoning into the picture.

Using word embeddings to quantify gender bias in political speech (Helena Heberer)

How can we use legislative speech to make statements about gender bias in political parties? Natural language processing has found many useful applications in political science, particularly in models based on word embeddings. However, word embeddings tend to reproduce cultural biases that are often inherent to the training data, among them a significant gender bias. This can be a problem if left uncorrected. But this methodological idiosyncrasy can also be leveraged to quantify and measure bias in text data. Some recent publications have generated interesting insights into making effective use of this in the context of political science. I build on this latest research and use word embeddings to examine gender bias in parliamentary speech. This is a promising application of the approach that has not yet been explored. I am particularly interested in the empirical link between gender bias and the representation of women – for example, are parliamentarians’ speeches more biased if few women are represented in their party, what change occurs under female party leadership, and can we make statements about parties based on their level of gender bias?

Detecting left and right-wing populism in the German Bundestag (Lukas Erhard)

The “rise of populism” concerns many political observers and has been linked to, for instance, increasing economic anxiety or ideological polarization (Mudde & Rovira Kaltwasser, 2018). The prevailing scholarly view states that the constituting dimensions of populism are anti-elitism and people-centrism, and that both must be present to consider a text to be populist (Dai & Kustov, 2022). In addition, the “thin ideology” of populism is often attached to a “thick ideology”. 

To train a model that detects and distinguishes these dimensions, we created an annotated dataset based on the 18th and 19th legislative periods of the German Bundestag. Our final dataset comprises 8795 annotated sentences, each labeled by 5 coders with a background in political science. Out of these, at least one annotator classified 3236 (36.79%) as anti-elite, 1608 (18.28%) as people-centric, 1393 (15.84%) as left, and 773 (8.79%) as right. As expected for an ambiguous and subjective concept like populism, the level of agreement among coders is relatively low. Consequently, we are conducting a comparative analysis of multiple approaches regarding human-label variation (Plank, 2022). We will also provide the unaggregated data to the community.

We treat our classification problem as a multilabel problem. In doing so, we can use all available samples for detecting populism and learn the ideology independently of it. Our classifier achieves the following F1-scores: F1_macro = 0.75; F1_elite = 0.85; F1_pplcentr = 0.70; F1_left = 0.72; F1_right = 0.72. Due to the lack of a definitive ground truth, we perform a battery of validation checks. We find that the predictions (i) exhibit a high correspondence (84%) to reported examples (Ernst et al., 2017; Schürmann & Gründl, 2022), (ii) when aggregated, rank parties in line with expert surveys (Jolly et al., 2022), (iii) clearly outperform existing approaches, and, finally, (iv) reveal high face validity for each dimension when inspected qualitatively.
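For readers less familiar with the multilabel setup, the following is a minimal sketch (not the authors' code) of how such a classifier can be configured with Hugging Face Transformers; the encoder name, label names and decision threshold are assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Minimal sketch of a multilabel setup: one sentence can receive any subset of
# the four labels, so the model uses a sigmoid/BCE head rather than a softmax.
LABELS = ["anti_elite", "people_centric", "left", "right"]
MODEL_NAME = "deepset/gbert-base"   # assumed German encoder, not from the abstract

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # BCEWithLogitsLoss under the hood
)

def predict(sentence: str, threshold: float = 0.5):
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits)[0]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]
```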

Adapting Named Entity Recognition to Climate Change Data (Siyao Peng)

In this abstract, we present ongoing work on adapting named entity recognition (NER) to the climate change (CC) domain.

We focus on CC NER adaptation through corpus construction and modeling experiments. We collect a new Climate-NER corpus with 130K tokens balanced across five genres (academic articles, IPCC reports, web news, Wikipedia articles and YouTube transcriptions) and two languages (English and German). We follow CoNLL (Tjong Kim Sang and De Meulder, 2003) and GermEval (NoSta-D; Benikova et al., 2014) in annotating the four basic entity types PER, ORG, LOC, and MISC to facilitate cross-domain adaptation. We also add -deriv(ation) and -part tags for each entity type to accommodate long compounding in German. In addition, motivated by CrossNER (Liu et al., 2021), we design several CC-extended labels – such as EVENT, CHEMICAL, CAUSE, and CONSEQUENCE – to capture entities peculiar to the CC domain. For holdout testing, we will include CC tweets posted around the COP28 conference as a new genre and as a source of unseen CC terminology. Source texts (except for Twitter) and annotations will be made publicly available.
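Purely as an illustration of the tag format (the sentence and labels below are invented, not taken from the corpus), a BIO-encoded example with a GermEval-style -deriv tag might look as follows; in the CC-extended scheme, tokens such as "Klimawandel" might additionally receive labels like CAUSE, depending on the authors' guidelines.

```python
# Invented example, not from the Climate-NER corpus: token/label pairs in a
# BIO scheme, with GermEval-style -deriv marking a location-derived adjective.
example = [
    ("Der",           "O"),
    ("kalifornische", "B-LOCderiv"),   # adjective derived from the location "Kalifornien"
    ("Waldbrand",     "O"),
    ("wurde",         "O"),
    ("vom",           "O"),
    ("IPCC",          "B-ORG"),
    ("dem",           "O"),
    ("Klimawandel",   "O"),
    ("zugeschrieben", "O"),
]
```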

Our CC NER experiments are conducted in three settings. Firstly, we examine in-domain Climate-NER training and report SOTA models’ performance on the basic and CC-extended tagsets. Secondly, domain-adaptation strategies requiring the same source/target tagsets and strategies that can accommodate diverse tagsets are applied to the basic and CC-extended tagsets and compared against the in-domain baselines. Lastly, we examine these in-domain and domain-adaptive models on the holdout Twitter test data and analyze the interaction between domain and genre adaptation. We foresee that the Climate-NER corpus adds a new dimension to NER domains, and our three-fold experiments give insights into the interplay among domains, genres, and languages in NER model adaptation.

A topic network analysis of the Becker-Posner blog, 2004-2014 (Morten Luchtmann)

In 2004, the American economist and Nobel laureate Gary Becker started working with the legal scholar and federal judge Richard Posner on a joint blog, in which they positioned themselves as public intellectuals and discussed various political and economic issues for a larger audience (Fleury and Marciano, 2012). Throughout this 10-year collaboration, the authors alternately commented on a given theme every week, often linked to current events, and they sometimes also replied to user comments. In this work, we describe Becker and Posner as public intellectuals by analyzing how the two scholars differ in their use of language on the blog given their respective careers. Using their blog as a dated corpus of documents, we observe how the 2008 financial crisis, which sparked a surge of interest in economics (Mata, 2011), affected the discourse on the blog. We also try to document whether and how the two scholars differ in their use of language. The aim is to characterize their political capacity as agenda setters and their agency as social commentators during a period of economic stress.

We first use the topic modeling method BERTopic (Grootendorst, 2022) to obtain two individual topic models, trained respectively on all of Posner’s and all of Becker’s posts. We then propose a method called topic networks, in which we compare similar but distinct topic models through a graph construction. We first connect topics with their associated words and with other similar topics in the embedding space. We then cluster the resulting graph using a community detection algorithm (Blondel et al., 2008) and apply a graph layout algorithm (Jacomy et al., 2014) in order to obtain a detailed, human-readable overview of the blog as a collection of areas of discourse by Posner and Becker. This construction enables us to quantitatively compare Posner’s and Becker’s use of language throughout the blog. Generally, we detect more topics for Becker, and observe that he makes more use of topics related to labor as well as food, energy and natural resources. In comparison, Posner makes wider use of topics related to public finance and cost-benefit analysis.
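A simplified sketch of the graph construction (not the authors' implementation): it covers only the topic-to-word links, omitting the additional topic-to-topic links in the embedding space, and assumes default BERTopic settings.

```python
import networkx as nx
from bertopic import BERTopic
from networkx.algorithms.community import louvain_communities

# Simplified sketch: fit one BERTopic model per author, link each topic to its
# top words, and cluster the combined graph into larger areas of discourse
# with Louvain community detection.
def build_topic_network(posts_by_author, top_n=10):
    g = nx.Graph()
    for author, posts in posts_by_author.items():
        model = BERTopic()
        model.fit(posts)
        for topic_id in model.get_topics():
            if topic_id == -1:              # skip BERTopic's outlier topic
                continue
            topic_node = (author, topic_id)
            g.add_node(topic_node, kind="topic", author=author)
            # Topics sharing highly weighted words become indirectly connected,
            # which lets community detection group them into discourse areas.
            for word, weight in model.get_topic(topic_id)[:top_n]:
                g.add_node(word, kind="word")
                g.add_edge(topic_node, word, weight=weight)
    return g

# g = build_topic_network({"becker": becker_posts, "posner": posner_posts})
# communities = louvain_communities(g, weight="weight", seed=0)
```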

In a second part, we measure the frequencies of occurrence of the identified clusters over time, and we use these results to deduce that the financial crisis had a disruptive and lasting effect on Posner and Becker’s agenda and on the main topics of the blog. We also split our corpus into a pre-crisis, a crisis and a post-crisis period, and show that each of the authors’ topics follows one of four types of temporal evolution (balanced use, crisis peak, post-crisis increase and post-crisis decrease). This method also allows us to highlight further differences between Posner and Becker.

Finally, we give a brief overview of the nature of the commenting audience, which seems to consist mainly of male users, and discuss the interaction between the anonymous commentators and the two bloggers.

Portrayals of Income Inequality in the UK House of Commons (Michael Webb)

Income inequalities are on the rise across advanced economies. Even as developmental differences between countries shrink, differences within countries continue to widen. There is evidence that perceptions of income inequality influence individuals’ preferred policy outcomes. Yet these studies have not typically focused on how income inequality is connected to other forms of inequality and other problems in society in the popular discourse. Individuals’ awareness of these connections may influence their ability to perceive the impact of inequality in their daily lives, which in turn strongly influences their general understanding of inequality and their preferences for how it is addressed. Thus, having a clear understanding of how income inequality is connected to other topical areas in a country’s political discourse may be an important step in understanding why preferences around inequality differ across countries (and potentially across regions within countries).

To that end, as a preliminary step in this investigation, we employ ChatGPT to assess how members of parliament connected income inequality to other issues during their addresses to Westminster Hall in 2022. These addresses allow MPs to raise awareness of issues that they perceive as especially important. They are moments of discursive, informal agenda setting, which offer insight into how MPs hope to frame issues. In this sense, they are a useful starting point for understanding how inequality is (and is not) thematically connected to other issues in elite discourse. We draw text from the UK’s Hansard API and then interface with ChatGPT’s API, prompting the model to assess which topics are being addressed in MPs’ speeches and whether explicit connections are drawn to the issue of income inequality.
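A minimal sketch of such a prompting step with the OpenAI Python client (illustrative only; the prompt wording, model name and output format are assumptions, not the authors' exact setup).

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative prompt; the exact wording, model choice and output format used
# in the project are not specified in the abstract.
PROMPT = (
    "You will be given a speech delivered in Westminster Hall. "
    "List the main policy topics it addresses, and state whether the speaker "
    "explicitly connects any of them to income inequality. "
    "Answer as JSON with the keys 'topics' and 'inequality_links'.\n\n"
    "Speech:\n{speech}"
)

def analyse_speech(speech_text: str, model: str = "gpt-4o-mini") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT.format(speech=speech_text)}],
        temperature=0,
    )
    return response.choices[0].message.content
```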

As each address is linked to a specific MP, we also aim to contrast the thematic connections between income inequality and other issues across parties as well as regions.

Attitude identification in diplomatic speeches: challenges in the annotation process (Mariia Anisimova)

This study provides an overview of the challenges that arose during the creation of the annotation scenario for attitude annotation in diplomatic speeches of the UNSC.

Diplomatic speeches form a peculiar group of texts that differ from other types of discourse. Prominent characteristics of these texts are their understated tone (Stanko, 2001) and indirectness. These pragmatic features prove to be important to how diplomats express opinions, which are most frequently not their own but those of the political body they represent. It is also because of these features that diplomatic attitudes are so particular and require their own approach in the annotation process.

Research on annotating attitudes has long been ongoing (Fuoli, 2018; O’Donnell, 2013) and is generally perceived as a complex procedure, not only due to elaborate annotation schemes but even more so due to the lack of definitive criteria for identifying and categorizing attitudes and the other appraisal labels.

Our annotation scenario followed the attitude part of Appraisal theory (Martin and White, 2005). The various challenges in annotating the speeches, such as the extent of arguments, the identification of attitude in verbal forms, and complex structures, were classified and, in part, resolved. The conclusions of this study should be helpful for anyone considering this type of attitude analysis when working with diplomatic texts.