Voice, transitivity, diathesis and valency are all notions that have been defined in many different ways over the history of linguistics. In my talk I will give an overview of these definitions and propose a coherent, cross-linguistically valid framework for studying these notions.
Valency refers to the number of arguments a predicate takes, and the valency of verbs ranges from 0 to 3. Transitivity can also be viewed as the number of arguments present in a clause, but it is usually viewed as a multilayered notion that also comprises features such as affectedness, agency and definiteness. In this view, highly transitive events involve an active, volitionally instigating agent that targets its action at a patient that is directly and thoroughly affected by the denoted event. For example, ‘The teacher broke the vase’ describes a highly transitive event, while ‘John likes beer’ does not.
Diathesis refers in general to the ways in which semantic roles are expressed by the arguments of a clause. For example, in the clause ‘the child broke the vase’, the subject refers to an agent and the object to a patient, while in ‘the vase broke’, the subject refers to a patient, i.e. the diatheses of these sentences are different. Voice also concerns how semantic roles are referred to, but with the very important difference that we speak of different voices only when the changes in diathesis are signaled on the predicate. For example, passives are viewed as voice in English, but reflexives are not, because the verb morphology changes only in passives.
Moreover, definitions of voice differ according to which valency-changing or argument-marking-modifying categories are viewed as instances of voice. Traditionally, passives and middles are viewed as voice, but more recently causatives and applicatives have also increasingly been taken into account in discussions of voice.
In my talk, I will discuss the notions noted above in more detail in light of data from many different languages. I will also illustrate different ways of defining the notions (e.g. differences in the definition of voice in Klaiman 1991, Kulikov 2010 and Zúñiga and Kittilä 2019), and some problems related to these differences. The goal of the talk is to propose a coherent taxonomy for defining the notions discussed.
In my talk, I will discuss a typology of voice categories, largely following Zúñiga and Kittilä (2019). In addition to proposing a typology, I will illustrate the relevant categories with examples from a variety of languages.
First, voice categories can be divided into two types based on whether or not they affect argument structure. Zúñiga and Kittilä (2019) distinguish between argument-structure modifying and argument-structure preserving operations. The former type is illustrated by causatives, applicatives and anticausatives; the latter by, for example, passives and antipassives. The two types differ in that, in the former case, the underived and derived constructions describe different non-linguistic events, whereas a passive or antipassive, for example, can describe the same event as its underived counterpart. From this it also follows that argument-structure preserving operations may have syntactic functions, which is not possible for argument-structure modifying operations.
Moreover, voice categories can be divided into nucleativizing and denucleativizing operations depending on whether they introduce an argument into the clause core or remove one from it. Nucleativizing operations install a new argument in the semantic structure of the clause as a syntactic core argument (as a subject or a primary/direct object). Examples include applicatives and causatives: applicatives install a new argument as a direct/primary object, while causatives install a new A as a subject. Anticausatives and antiapplicatives, in turn, are examples of denucleativizing operations. Both types can be divided into subtypes based, for example, on whether the installed or removed argument is an agent or a patient. In addition to the types discussed so far, languages may have diatheses that are not marked on the verbal predicate, which Zúñiga and Kittilä (2019) label covert diatheses. An example is provided by labile verbs, which can appear in both intransitive and transitive constructions without any change in verbal morphology, as in the English examples ‘the child broke the vase’ vs. ‘the vase broke’.
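As a minimal sketch, the two cross-cutting divisions described above can be recorded in a small lookup table. The Python below is purely illustrative: the class name, field names and the `select` helper are assumptions introduced here, and any cell that the text above does not specify is left as `None` rather than filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceOperation:
    name: str
    # True = argument-structure modifying, False = preserving,
    # None = not stated in the text above
    modifies_argument_structure: Optional[bool]
    # True = nucleativizing, False = denucleativizing,
    # None = not stated in the text above
    nucleativizing: Optional[bool]
    # False for covert diatheses (no marking on the verbal predicate)
    marked_on_verb: bool = True


# Only the classifications mentioned in the text are encoded here.
OPERATIONS = [
    VoiceOperation("causative", True, True),
    VoiceOperation("applicative", True, True),
    VoiceOperation("anticausative", True, False),
    VoiceOperation("antiapplicative", None, False),
    VoiceOperation("passive", False, None),
    VoiceOperation("antipassive", False, None),
    VoiceOperation("labile alternation", None, None, marked_on_verb=False),
]


def select(**criteria):
    """Return the names of operations whose fields match all criteria."""
    return [
        op.name
        for op in OPERATIONS
        if all(getattr(op, key) == value for key, value in criteria.items())
    ]


print(select(nucleativizing=True))
print(select(modifies_argument_structure=False))
print(select(marked_on_verb=False))
```

Keeping the two divisions as separate fields reflects the point that they cross-cut each other: an operation is classified once for its effect on argument structure and once for whether it adds or removes a core argument.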
In my talk, I will illustrate the above-mentioned operations in light of data from an array of geographically and genealogically diverse languages. I will also discuss the rationale behind the typology, for example, why some operations are more common across languages and more productive within languages than others. For example, causatives are much more frequent across languages than applicatives and antiapplicatives, and within languages, causatives are more productive than anticausatives. As the discussion will show, there are good reasons for both of these tendencies.
Whether by choice or impulse, we perceive and construe a situation in different ways depending on the context and our emotional and attitudinal states. Voice is a linguistic means through which we express our perception and construal of a given situation. Among the better-known voice constructions are the passive voice constructions. Equally well-known, but not readily recognized as voice constructions until recently, are the causative constructions (Zúñiga and Kittilä, 2019). Another category of voice construction that has received some attention in recent decades is the ‘middle voice’ category (Kemmer, 1993), which comprises a variety of constructions, including the anticausative (e.g. spontaneous middle) constructions and even the reflexive constructions. In this talk, I will address the question of how voice constructions may evolve over time. This will help us better understand why, in some languages, the same voice markers may serve multiple voice (and sometimes also other related) functions.
I will first use an example from Odia, an eastern Indo-Aryan language, to show how the verb jibaa ‘go’ developed into a middle voice marker and then into a passive voice marker through its role as a light verb, galaa and jaae in past and non-past contexts respectively (Sahoo and Yap, forthcoming). I will then show a similar middle-to-passive extension in Korean, where the verb ti ‘fall’ developed into the middle/passive voice suffix -eci (Ahn and Yap, 2017, 2021). In addition, Korean also has a voice suffix -i, with diachronic evidence suggesting a causative-to-middle and passive development (Yap and Ahn, 2019). I will close with examples from Sinitic languages to show how ‘give’ verbs are used not only in causative (trivalent) but also in passive (bivalent) constructions, and how, in some varieties such as Mandarin and Southern Min, the semantically bleached ‘give’ verbs are also found in unaccusative (monovalent) constructions to express speaker affectedness.
From the Indo-Aryan and Korean examples, we see evidence of valence increment as galaa/jaae and -eci constructions extend their presence from monovalent middle constructions to bivalent passive constructions. From the Korean examples, we also see evidence of valence reduction as the suffix -i extends its contexts of use from trivalent causative constructions to monovalent middle and bivalent passive constructions. In the case of Sinitic languages, we see evidence of valence reduction where trivalent ‘give’ verbs extend into bivalent passive constructions and, in a typologically rare development, into monovalent unaccusative constructions in some Sinitic varieties, where they implicitly signal the presence of an affected speaker, a phenomenon which Huang (2013) has described as a ‘phantom affectee’.
By looking at the diachrony of voice markers, we can better understand the relationship between voice categories, and also the ease with which our minds navigate and shift between various voice categories as we construe and re-construe the world and our (inter)personal experience through language.
Languages provide various ways of describing situations, and in particular, of describing the relations between the participants in those situations. Because of this, using language often implies making a choice, for example between the active and passive voice: “The nurse vaccinated the child” vs. “The child was vaccinated by the nurse”. Cognitive linguists use the theoretical concept of “construal” to account for these alternating ways of expression (Langacker 1987). They consider two grammatical possibilities for expressing one and the same situation as two different ways of describing and thereby “construing” that situation. In other words, the lexical and syntactic choices reflect a specific framing of the experience and, furthermore, a certain commitment to how that experience will be communicated between interlocutors. Language can thus be seen as a highlighting device that promotes or demotes the salience of various situational cues, which, in turn, modulates how we attend to those cues.
Most attempts to explain linguistic choices by appealing to alternative construals are based on the analysts' own intuitions about the data. In an attempt to put the concept of construal on a sounder, empirical footing, I rely on the theoretical notion of ception (Talmy 2000), which conjoins the domains of perception and conception. This makes it a suitable starting point for an empirical investigation of the question whether linguistic encoding affects the way in which events are perceived, and hence conceived, by speakers of a language. Sixty university students and staff participated in a visual-world eye-tracking experiment run on an Eyelink Portable Duo eye-tracker (Divjak et al. 2020). Participants heard the description of a scene, in either the active or the passive voice, before being shown a full-colour photograph depicting that scene. To analyse how differences in scene description affect scene viewing, several eye movement measures were extracted, including first fixation time (which allows us to determine the order in which the elements in a scene are accessed), first fixation duration (which picks up on the salience of any element(s) in the scene) and total dwell time (which provides an indication of overall processing effort).
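The three measures named above can be made concrete with a small sketch. The Python below assumes a hypothetical record format, one (onset_ms, duration_ms, aoi) tuple per fixation with times measured from picture onset, and it is not the analysis pipeline used in the experiment; the function name, the area-of-interest labels and the sample values are all invented for illustration.

```python
def fixation_measures(fixations):
    """Compute per-AOI measures from (onset_ms, duration_ms, aoi) tuples.

    Assumed, illustrative definitions:
      first_fixation_time     onset of the first fixation on the AOI
      first_fixation_duration duration of that first fixation
      total_dwell_time        approximated as the sum of all fixation
                              durations on the AOI
    """
    measures = {}
    # Sorting the tuples orders the fixations by onset time.
    for onset, duration, aoi in sorted(fixations):
        if aoi not in measures:
            measures[aoi] = {
                "first_fixation_time": onset,
                "first_fixation_duration": duration,
                "total_dwell_time": 0,
            }
        measures[aoi]["total_dwell_time"] += duration
    return measures


# Invented sample trial: two areas of interest, four fixations.
trial = [
    (180, 220, "agent"),
    (420, 310, "patient"),
    (750, 150, "agent"),
    (920, 200, "patient"),
]

result = fixation_measures(trial)
# AOIs appear in the order in which they were first fixated.
access_order = list(result)

print(access_order)
for aoi, values in result.items():
    print(aoi, values)
```

Comparing `access_order` and the per-AOI values across trials that followed an active versus a passive description is the kind of contrast the measures are meant to support.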
A Generalized Additive Mixed Effects model, fitted to the relationship between eye movement measures and construction, revealed that the linguistic construal of a scene affects its spontaneous visual perception in two ways: either by determining the order in which the components of a scene are accessed or by modulating the distribution of attention over the components, making them more or less salient than they naturally are. I will zoom in on the findings for the voice alternation and use them as a background against which to assess the claim that language can affect visual information uptake and hence the conceptualization of a static scene.