Copilots for Linguists
This is the website for Copilots for Linguists: AI, Constructions, and Frames. 2023. Cambridge University Press.
Authors: Tiago Torrent, Thomas Hoffmann, Arthur Lorenzi, and Mark Turner.
Write to the authors at copilotsforlinguists@gmail.com
This book outlines a research program: copilots for linguists. Linguists have historically used AI copilots. What is the current state of that practice, and how can we push that research program forward? Such a research program will develop very rapidly, with the advent of new machine learning techniques, new foundation models, new training of foundation models, and new APIs and plugins for existing copilots. We forecast the unification of Construction Grammar as a theoretical enterprise with the development of computational copilots derived from machine learning techniques. We have emphasized the all-important practice of prompt engineering for copilots. As human experts devise and refine prompts, templates for prompts, and scripts for prompting; as they inch forward with one gain after another, it would be useful to share a common forum for discussion. Accordingly, to serve all those who study copilots for linguists, we have established a moderated discussion group at copilotsforlinguists@case.edu, to which you may write to share reports of progress, prospects, your participation in the project, and your request for membership. We have additionally established this website, http://copilotsforlinguists.org, where we expect to post progress and updates.
Abstract for the Cambridge Element: AI can assist the linguist in doing research on the structure of language. In this Element, we illustrate this possibility by showing how a conversational AI based on a Large Language Model (AI LLM chatbot) can assist the Construction Grammarian, and especially the Frame Semanticist. An AI LLM chatbot is a text-generation system trained on vast amounts of text. To generate text, it must be able to find patterns in the data and mimic some linguistic capacity, at least in the eyes of a cooperative human user. We do not focus on whether AIs “understand” language. Rather, we investigate whether AI LLM chatbots are useful tools for linguists. We reframe the discussion from what AI LLM chatbots can do with language to what they can do for linguists. We find that a chatty LLM can labor usefully as an eliciting interlocutor. We present precise, scripted routines for prompting conversational LLMs.
Additional Materials for Chapters of Copilots for Linguists
Chapter 1
Chapter 2
Chapter 3
Prompts used in this chapter:
Hi, Assistant. The un-VERB construction which creates verbs such as e.g. “unbutton” or "untie" has a form that consist of "un" followed by a verb. The meaning of the construction is "reversal". Thus, "unbutton" means "to reverse a buttoning action" and "untie" means "to reverse a tying action." Please give 10 more examples of this construction.
Hi, Assistant. In the caused-motion frame, an agent performs an action on an object that causes it to move in a direction. “Throw” is a caused-motion verb. "Throw" evokes the caused-motion frame. "Throw" indicates that the subject of the verb performed an action, throwing, that caused an object to move in a direction. But the syntactic pattern "Noun Phrase - Verb Phrase - Noun Phrase - Prepositional Phrase" can also evoke the caused-motion frame. When it does, the verb can be intransitive. "She rolled the ball to me" does not mean that she actually rolled; instead, it indicates that she performed some action, whatever it was, that caused the ball to move in a direction in a rolling manner. "He slid the plate across the table to me" uses the intransitive verb "slide." In this sentence, the subject, "he," did not actually move across the table, but instead performed an action that caused the plate to move in a direction across the table, indicated by the prepositional phrase "to me." Please give 20 examples of intransitive verbs used in caused-motion clausal syntactic patterns, like "He floated the boat to me."
The XYZ construction, e.g. “Paul (X) is the father (Y) of Sally (Z)” or “Causation (X) is the cement (Y) of the universe (Z)” prompts us to evoke a frame indicated by Y and to blend it with the X-Z information. Please give 10 more examples in which the Y term evokes a conceptual frame to be blended with the X-Z information.
Give me a construction grammar analysis of Knock Knock jokes.
Name ten other discourse-level patterns that can be analyzed as constructions.
Give a detailed analysis of FORM and MEANING elements of the Apology Construction.
Give me an example of a full text that is created by the Apology Construction. Indicate which parts correspond to which FORM or MEANING component of the construction.
Give a detailed analysis of FORM and MEANING elements of the Wedding Toast Construction.
Give me an example of a full text that is created by the Wedding Toast Construction. Indicate which parts correspond to which FORM or MEANING component of the construction.
Chapter 4
Prompts used in this chapter:
Brazilian Portuguese has two constructions sharing the same formal pattern, namely one involving a noun phrase, a verb phrase and another noun phrase. One of those constructions is the Active transitive construction, where the first noun phrase is an agent who acts on the patient expressed in the second noun phrase. An example of such a construction would be the sentence "A cozinheira quebrou as nozes". The other construction is the Split argument construction, where both noun phrases are a patient for the action expressed in the verb. Usually, there is a whole-part relation between the first noun phrase and the second one. An example of such a construction is the sentence "A panela quebrou o cabo". Please give me ten additional examples for each type of constructions mentioned above. Make sure not to repeat the verbs in the example lists.
The examples created for the Split argument construction are wrong. Note that in "A panela quebrou o cabo", the noun phrase "o cabo" represents a part of the object indicated by the noun phrase "a panela". Another good example of the Split argument construction would be "A menina quebrou o braço", meaning that the action was not performed on purpose, but that the entity represented by "a menina" suffered the event in the verb "quebrou".
Provide the construction grammar analysis of the construction licensing expressions such as "shark-safe beach", "child-safe beach" and "tsunami-safe beach". Explain the form and meaning properties of this construction.
There is a contradiction in your analysis. You say that the X position conveys a potential danger. But you also say that children in "child-safe beach" are protected from a danger. Which is the correct analysis?
Chapter 5
Chapter 6
The code used to automatically generate prompts can be found at: https://github.com/arthurlorenzi/copilot-for-linguists
Prompt templates:
The semantic frame for "<frame name>" is defined as follows: "<frame description>". Core frame elements in this frame are "<core frame element names>". <Core frame element definitions>. Words evoking this frame are the verbs <verb lexical unit lists>, adjectives <adjective lexical unit lists>, <all other POS lexical units>. Please propose 10 additional words that evoke the "<frame name>" semantic frame. Present them as a JSON array.
Now list 10 unique words in <other language> that evoke this frame.
The semantic frame for "<frame name>" is defined as follows: "<frame description>". Core frame elements in this frame are "<core frame element names>". <Core frame element definitions>. Words evoking this frame are the verbs <verb lexical unit lists>, adjectives <adjective lexical unit lists>, <all other POS lexical units>. Propose a semantic frame evoked by words such as <words evoking another frame>.
Now, please propose semantic frames for other kinds of <frame name>. Present them as a table in which columns are "Frame Name", "Frame Definition", "Frame Elements", "Frame Element Definition", "Words evoking the frame".
The semantic frame for "<parent frame name>" is defined as follows: "<parent frame description>". Core frame elements in this frame are "<core parent frame element names>". <Core parent frame element definitions>. Words evoking this frame are the verbs <verb parent lexical unit lists>, adjectives <adjective parent lexical unit lists>, <all other POS parent lexical units>. The frame for "<child frame name>" inherits the frame for "<parent frame name>" and is defined as "<child frame description>". The "<child frame name>" frame has one core frame element, "<child frame element name>", which is defined as "<child frame element definition>". The "<child frame name>" frame is evoked by words such as the verbs <verb child lexical unit lists>, adjectives <adjective child lexical unit lists>, <all other POS child lexical units>. Now, propose other semantic frames inheriting the "<parent frame name>" frame. Present them as a table in which columns are "Frame Name", "Frame Definition", "Frame Elements", "Frame Element Definition", "Words evoking the frame".
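The angle-bracket slots in the templates above can be filled in programmatically. The following is a minimal sketch of that idea, not the authors' code (which is in the repository linked above). The function names fill_template and parse_word_list are hypothetical, and the "Toy_cooking" frame data is invented for illustration, not taken from FrameNet. It also assumes the model's reply contains a single JSON array, as the first template requests.

```python
import json
import re

# First template from the list above, with its angle-bracket slots kept verbatim.
TEMPLATE = (
    'The semantic frame for "<frame name>" is defined as follows: '
    '"<frame description>". Core frame elements in this frame are '
    '"<core frame element names>". <Core frame element definitions>. '
    'Words evoking this frame are the verbs <verb lexical unit lists>, '
    'adjectives <adjective lexical unit lists>, <all other POS lexical units>. '
    'Please propose 10 additional words that evoke the "<frame name>" '
    'semantic frame. Present them as a JSON array.'
)

# A slot is any run of text between "<" and ">".
SLOT = re.compile(r"<([^<>]+)>")


def fill_template(template: str, values: dict) -> str:
    """Replace every <slot> with its value (case-insensitive slot names).

    Raises KeyError if the template contains a slot with no value.
    """
    lookup = {key.lower(): value for key, value in values.items()}
    return SLOT.sub(lambda match: lookup[match.group(1).lower()], template)


def parse_word_list(reply: str) -> list:
    """Extract the JSON array from a model reply and return its items as strings."""
    start, end = reply.find("["), reply.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in reply")
    return [str(item) for item in json.loads(reply[start:end + 1])]


# Invented toy values, for illustration only (not real FrameNet data).
EXAMPLE_VALUES = {
    "frame name": "Toy_cooking",
    "frame description": "A cook prepares food by applying heat",
    "core frame element names": "Cook, Food",
    "core frame element definitions": (
        "The Cook is the person preparing the Food. "
        "The Food is what gets prepared"
    ),
    "verb lexical unit lists": "boil, roast",
    "adjective lexical unit lists": "boiled, roasted",
    "all other POS lexical units": "nouns roast, stew",
}

if __name__ == "__main__":
    print(fill_template(TEMPLATE, EXAMPLE_VALUES))
    print(parse_word_list('Here are the words: ["bake", "fry", "grill"]'))
```

Matching slots with a regular expression rather than str.format keeps the template text identical to the version printed above, so the same template string can be shown to readers and sent to the model.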
Chapter 7
Prompts used in this chapter:
English has a construction named Inverted order, where the direct object of the verb can be used before the subject to indicate that the action performed by the subject and affecting the object was unintentional. An example of this construction would be "The vase Mary broke". Please provide 10 additional examples of such construction and explain their meaning.
Further considerations
LLMs on the English Comparative Correlative
The English Comparative Correlative has been studied in, e.g., Fillmore 1986 [Charles J. Fillmore. 1986. Varieties of conditional sentences. In Eastern States Conference on Linguistics, volume 3, pages 163–182] and Hoffmann 2017 [Thomas Hoffmann. 2017. Construction Grammar as Cognitive Structuralism: the interaction of constructional networks and processing in the diachronic evolution of English comparative correlatives. English Language and Linguistics 21.2: 349–373. Cambridge University Press. doi:10.1017/S1360674317000181].
Weissweiler et al. 2022 [Weissweiler, Leonie, Valentin Hofmann, Abdullatif Köksal, & Hinrich Schütze. 2022. The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 10859–10882, December 7–11, 2022. Association for Computational Linguistics. https://aclanthology.org/2022.emnlp-main.746.pdf] ask not how an LLM can be a copilot for a construction grammarian doing linguistics, but rather whether an LLM can recognize instances of a fully worked-out construction, state its syntax, and understand its meaning. We have accordingly asked an LLM current as of 2023-04-28 about the English Comparative Correlative.
The following is an exchange on 2023-04-28 with ChatGPT from OpenAI, running the GPT-4 model.
USER: Answer as if you are a Construction Grammarian. English includes a construction called the "Comparative Correlative" construction. An example is "the more you eat, the fatter you get," In its main version, its FORM consists of two clauses, C1 followed by C2, each beginning with "the" and including a comparative adjective or a comparative adverb. Other examples include "the bigger they are, the harder they fall," "the more you yell, the hoarser you get." In its typical version, C1 presents the CAUSE and C2 presents the EFFECT produced by that CAUSE. Please present a Construction Grammar analysis of the Comparative Correlative, as a FORM-MEANING pair, giving its FORM and then its MEANING. Please elaborate as much and as specifically as you can on this construction, going beyond the analysis provided in this prompt.
ChatGPT: As a Construction Grammarian, I would analyze the Comparative Correlative construction as a form-meaning pair with a specific syntactic structure and a certain semantic function. In Construction Grammar, language is seen as a collection of constructions, which are conventionalized form-meaning pairings.
FORM:
The Comparative Correlative construction can be schematically represented as follows:
the [COMPARATIVE-ADJ/ADV] X, the [COMPARATIVE-ADJ/ADV] Y
Where X and Y are constituents of the construction, and COMPARATIVE-ADJ/ADV represents a comparative adjective or adverb.
The structure is characterized by:
Two clauses, C1 and C2, which are usually linked by a comma.
Each clause begins with the definite article "the."
Each clause contains a comparative adjective or adverb.
In most cases, the constituents X and Y are pronouns, noun phrases, or verb phrases.
MEANING:
The Comparative Correlative construction typically expresses a causal relationship between two situations or events. It indicates that an increase or decrease in the quality, state, or quantity expressed by C1 leads to a corresponding increase or decrease in the quality, state, or quantity expressed by C2. In other words, the construction highlights a proportional relationship between the two clauses, where C1 serves as the cause and C2 as the effect.
The meaning can be schematically represented as:
MORE/LESS (C1) -> MORE/LESS (C2)
Variations in the construction may alter its meaning slightly. For instance, when the Comparative Correlative is used with negative forms, it conveys a sense of impossibility or extreme difficulty:
the less you know, the better
In this case, the meaning can be interpreted as: the lesser knowledge you possess about a particular matter, the better off you are.
In summary, a Construction Grammar analysis of the Comparative Correlative construction considers both its form and meaning. The form consists of two clauses, C1 and C2, each beginning with "the" and including a comparative adjective or adverb. The meaning typically conveys a causal relationship between the two clauses, where C1 is the cause and C2 is the effect, often highlighting a proportional relationship between the situations or events described by the clauses. Variations in the construction may result in slightly different meanings, such as a sense of impossibility or extreme difficulty.
USER: Now provide 10 examples of the Comparative Correlative in English, with as much variation in the details of the example as is consistent with the specification of the construction. For each, explain as fully as possible the intended meaning.
ChatGPT:
Example: The more you practice, the better you become.
Meaning: As you practice more (C1), your skills improve proportionally (C2).
Example: The less you sleep, the more tired you feel.
Meaning: A decrease in sleep (C1) leads to a proportional increase in tiredness (C2).
Example: The faster she runs, the sooner she finishes the race.
Meaning: An increase in her running speed (C1) leads to a decrease in the time required to finish the race (C2).
Example: The farther you go, the more you discover.
Meaning: As you travel greater distances (C1), you experience a corresponding increase in discoveries or learning (C2).
Example: The higher the stakes, the more intense the competition.
Meaning: As the stakes of a situation increase (C1), the intensity of the competition increases proportionally (C2).
Example: The less you know, the less you worry.
Meaning: A decrease in knowledge or awareness about a matter (C1) results in a proportional decrease in worry or anxiety (C2).
Example: The more complex the problem, the longer it takes to solve.
Meaning: An increase in the complexity of a problem (C1) leads to a proportional increase in the time needed to solve it (C2).
Example: The more crowded the room, the louder the noise.
Meaning: As the number of people in a room increases (C1), the noise level increases proportionally (C2).
Example: The earlier you arrive, the better your chances of getting a seat.
Meaning: Arriving earlier (C1) leads to a proportional increase in the likelihood of securing a seat (C2).
Example: The more ingredients you add, the more flavorful the dish becomes.
Meaning: As you add more ingredients to a dish (C1), the flavor of the dish increases proportionally (C2).
These examples showcase variations in the details of the Comparative Correlative construction while maintaining the overall structure of the construction. Each example expresses a causal relationship between two situations or events, with C1 serving as the cause and C2 as the effect. The construction highlights the proportional relationship between the two clauses, conveying the idea that an increase or decrease in the quality, state, or quantity in C1 leads to a corresponding increase or decrease in the quality, state, or quantity in C2.
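The surface FORM described in the exchange above (two clauses, each opening with "the" followed by a comparative) can be roughly approximated with a regular expression. The sketch below is illustrative only. It is not a method from the book or from Weissweiler et al., the function name is hypothetical, and the comparative test is a crude heuristic: any word ending in "-er" counts, so it over-matches strings such as "The other day, the teacher left." It checks form only and says nothing about meaning.

```python
import re

# Crude stand-in for "comparative adjective or adverb": "more"/"less",
# or any word ending in "-er" (this over-matches, e.g. "other", "teacher").
COMPARATIVE = r"(?:more|less|\w+er)"

# One clause: "the" + comparative + anything up to the next comma or semicolon.
CLAUSE = rf"the\s+{COMPARATIVE}\b[^,;]*"

# Whole sentence: clause C1, a comma, clause C2, optional final punctuation.
PATTERN = re.compile(rf"^\s*{CLAUSE},\s*{CLAUSE}[.!]?\s*$", re.IGNORECASE)


def looks_like_comparative_correlative(sentence: str) -> bool:
    """Return True if the sentence matches the surface schema
    'the [COMPARATIVE] X, the [COMPARATIVE] Y' (form only, not meaning)."""
    return PATTERN.match(sentence) is not None


if __name__ == "__main__":
    for text in [
        "The more you practice, the better you become.",
        "The higher the stakes, the more intense the competition.",
        "She rolled the ball to me.",
    ]:
        print(text, looks_like_comparative_correlative(text))
```

A surface matcher of this kind can pre-filter candidate sentences. Whether a candidate really instantiates the construction, and what it means, still has to be judged by a linguist.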