Past projects

Language complexity meets discourse analysis: exploring the interplay of complexity and subjectivity

in collaboration with Maite Taboada

This project combines corpus-based quantitative methodologies with discourse analysis by investigating the relationship between text complexity and subjectivity as descriptive features in opinionated writing. The specific interest is in how the complexity of subjective text types interacts with the discourse features (e.g. argumentative markers, sentiment words, modals) customarily used to characterise these text types.

The database is the Simon Fraser Opinion and Comments Corpus (SOCC), which comprises opinion articles and their corresponding reader comments from the Canadian online newspaper The Globe and Mail. To explore the interplay between different levels of text complexity and various markers of subjectivity, we employ conditional inference trees and random forests. Text complexity is assessed in terms of Kolmogorov complexity, which measures the complexity of a text by the length of the shortest possible description of this text (Ehret 2018; Ehret 2017). In this project, we analyse complexity at the overall, morphological and syntactic levels. Subjectivity is defined here as the linguistic expression of evaluation and opinion (e.g. Hunston and Thompson 2000) and is operationalised as the frequency of lexico-grammatical items which are used to convey subjectivity. Based on the extensive literature on the topic (e.g. Wiebe et al. 2004, Martin and White 1995, Biber and Finegan 1989, Halliday 1985), this set of subjectivity and argumentation markers comprises evaluative words, stance adverbials, connectives and modals.
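As a rough illustration of the two measures described above: Kolmogorov complexity cannot be computed exactly, so it is commonly approximated by the size of a text after compression. The sketch below assumes gzip as the compressor and uses a small, hypothetical marker list (MARKERS); neither the function names nor the word list are taken from the project itself, and the project's actual procedure may differ.

```python
import gzip
import re

def compressed_size(text: str) -> int:
    """Size in bytes of the gzip-compressed UTF-8 encoding of the text."""
    return len(gzip.compress(text.encode("utf-8"), compresslevel=9))

def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size; lower values mean more redundancy."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("cannot compute a ratio for an empty text")
    return compressed_size(text) / len(raw)

# Hypothetical marker list for illustration only (modals, stance adverbials,
# connectives, evaluative words); a real study would use a much larger set.
MARKERS = {"should", "must", "might", "clearly", "unfortunately",
           "however", "terrible", "great"}

def marker_rate(text: str, markers=MARKERS, per: int = 1000) -> float:
    """Frequency of marker tokens, normalised per `per` word tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in markers)
    return per * hits / len(tokens)

# Two toy texts: one highly repetitive, one short but varied and opinionated.
repetitive = "the cat sat on the mat. " * 50
varied = ("Clearly, the council should reconsider: its terrible plan might "
          "ruin a great neighbourhood, however popular it seems.")

for name, sample in [("repetitive", repetitive), ("varied", varied)]:
    print(name, round(compression_ratio(sample), 3), round(marker_rate(sample), 1))
```

Raw compression ratios depend heavily on text length, so a comparison across texts would in practice need samples of comparable size; the toy texts here only show the direction of the effect.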


Funding:

Are online comments like conversation? Defining a new register

in collaboration with Maite Taboada

This project is an exercise in multi-dimensional analysis and examines the textual properties of online news comments as sampled in the Simon Fraser Opinion and Comments Corpus (SOCC). Web-based communication is ubiquitous, pervasive and controversial. It is fairly well studied in terms of its effects on media, social behaviour and society, verbal abuse and civility (e.g. Wolfgang 2018, Clarke and Grieve 2017, Kolhatkar and Taboada 2017, Rösner et al. 2016).

In this study, the focus is on the question of whether online news comments are, after all, like face-to-face conversation - or not. Some editors and authors refer to online comments as "dialogue" (McGuire 2015) or "online conversations" (Woollaston 2013), and even some researchers characterise comments as conversation (e.g. Napoles et al. 2017, North 2007, Godes and Mayzlin 2004). Yet these assumptions lack empirical support.

We are the first to systematically explore register-relevant properties of online news comments using multi-dimensional analysis (MDA) techniques (Biber 1988). Specifically, we apply MDA to establish what online comments are like by describing their linguistic features and comparing them to traditional registers (such as face-to-face conversation, academic writing and letters) and web-based registers (such as blogs, reviews and advice).


Funding: