Co-Creative Artificial Intelligence: Theory, Methods, and Model

Abstract

Co-creative artificial intelligence (AI) is a process where a human collaborates with an AI agent on a shared creative product. This paper introduces a cognitive theory of co-creation that leverages the cognitive science paradigm of enaction to describe different types of sense-making processes and interaction dynamics of co-creation. The theory proposes that participatory sense-making, a social process of negotiating meaning, is a critical ingredient of co-creation. This paper reviews the co-creative sense-making (CCSM) framework, which is a data collection method for co-creative AI systems to quantify co-creation to aid in experimentation and analysis. The CCSM is an in situ data collection method encoded into co-creative systems making them quantified co-creative AI systems. Finally, a model of co-creation is presented that visually depicts turn-taking, improvisational offers, interaction couplings, and the cognitive state of the collaborators through time. The main contribution of this paper is presenting a unified theory, method, and model for co-creative AI. 

1. Introduction

Creativity is typically defined as a product or process that is novel and valuable (Sternberg & Lubart, 1999). Creativity has been interwoven in technological developments since the early days of computing (Boden, 1994). More recently, creativity support tools (CSTs) have become effective at helping people rapidly explore ideas, visualize multiple solutions, and perform complex simulations (Shneiderman, 2007). Yet, many creativity support tools are not able to create original artistic contributions, such as adding a new line, poem stanza, or musical accompaniment to a shared artwork (Davis et al., 2014). Advances in artificial intelligence (AI), computational creativity, and robotics are enabling the development of co-creative AI agents, i.e. computer colleagues (Lubart, 2005), that can collaborate with humans in a more naturalistic and improvisational manner.

There are co-creative systems in a wide variety of domains, such as drawing (Davis, 2015), design (Karimi et al., 2019; Ibarolla et al., 2022; Zhu et al., 2018; Kim & Maher, 2021), dance (Long et al., 2019; Trajkova et al., 2023), poetry (Kantosalo & Riihiaho, 2019), music (Biles, 2002; Huang et al., 2022), theatrical improvisation (Magerko et al., 2011), and game design (Guzdial et al., 2018; Margarido et al., 2022). Instead of replacing the creative individual, co-creative AI includes that user in a creative and collaborative dialogue (Bown et al., 2020). Here, the creative product emerges through interaction, rather than single-shot execution of algorithms. A critical element in co-creative interaction is the turn-taking that emerges between the user and system. In co-creative interaction, the interaction paradigm changes from input-execution to turn-taking co-creation. We argue this transition from execution to co-creation represents a paradigmatic shift in computing.


In the co-creation paradigm, the unit of analysis is the interaction dynamics and turn-taking between the user and system through time. The human-centered perspective of HCI recruits cognitive theories, such as distributed cognition (Hutchins, 2000) and situated action (Suchman, 2007) to help understand the user’s mental model, cognitive processes, situational context, social context, and intentional stance toward a system. The argument we propose is that the new paradigm of co-creative interaction with AI systems requires an additional research methodology that focuses on interaction dynamics, sense-making, and participatory sense-making between the user and system.


As a computing paradigm, co-creative AI has the potential to radically shift how we engage with technology. The transition to a more collaborative and dialogical interaction has already begun with the integration of large language models (LLMs) (Zhao et al., 2023) into a variety of domains, including conversational chat (Bang et al., 2023), programming (Chen et al., 2021), writing (Yuan et al., 2022), and education (Kasneci et al., 2023). However, we need new theoretical tools and frameworks to adequately study and design for this new paradigm of computing. In the co-creative paradigm, we must strive to understand the dynamics of interaction between the user and system through time in addition to their cognitive and sociocultural context that traditional human-centered approaches investigate. 


During the process of co-creation, participants engage in open-ended meaning construction that gradually builds through time. Given the open-ended nature of co-creation, it is difficult to quantify and evaluate. This paper proposes employing the cognitive science theory of enaction as a theoretical framework for understanding co-creation. Enaction provides the theoretical tools to investigate co-creation, such as sense-making, participatory sense-making, creative sense-making, and interaction dynamics. Based on the enactive cognitive science literature, a cognitive framework was derived for quantifying the interaction dynamics of co-creation called co-creative sense-making (CCSM). Within the CCSM framework, four categories of data—1) interaction dynamics, 2) cognitive dynamics, 3) collaboration dynamics, and 4) domain behaviors—collectively provide a quantified model of co-creation. When viewed together, these categories of data form a 'Gestalt' that indicates the level of co-creation achieved and provides insights into the intricate dynamics sustained throughout the interaction, offering a more comprehensive understanding of the co-creative process.



The main contribution of this paper is a three-part framework in the evaluation of co-creative AI: a unified theoretical foundation, a methodology operationalizing that theory, and a visual model representing key elements of the method. The theory is drawn from enactive cognitive science and existing co-creative AI literature and provides the basis for the model. This framework seeks to remove domain-specific limitations, offering researchers across interdisciplinary and multidisciplinary fields a valuable tool for quantitatively evaluating the collaborative dynamics of co-creative processes.


The remainder of this paper is structured into three main sections, further exploring and applying the CCSM framework. The first section develops a theoretical perspective of co-creative AI. The next section builds on the theory by describing the co-creative sense-making methodological framework for quantifying co-creation. The final section presents the enactive model of co-creation, which visually depicts the interaction dynamics. The paper concludes with a discussion of the potential applications of this approach to co-creative systems.

2. Co-Creative AI Theory

A working definition of co-creativity could start as follows: 


"Co-creativity is classified as multiple parties contributing to the creative process in a blended manner (Candy et al. 2002)... Co-creativity allows participants to improvise based on decisions of their peers. Ideas can be fused, and built upon in ways that stem from the unique mix of personalities and motivations of the team members (Candy et al. 2002). Here, the creative product emerges through interaction and negotiation between multiple parties, and the sum is greater than individual contribution[s]. These interaction principles can be extended to include a sufficiently creative agent that can collaborate with human users in a new kind of human-computer creativity" (Davis et al., 2014).


Co-creative artificial intelligence (AI) is an interdisciplinary field of AI that brings together HCI, AI, Machine Learning, Computational Creativity, Creativity Research, Collaboration Research, and Cognitive Science to enable the design, implementation, and evaluation of computationally creative systems that collaborate with users on shared creative products (Davis, 2014). This field bridges the academic domains of creativity support tools (CST) that accelerate the user's existing creativity (Gross, 1996; Gross & Do, 1996; Nakakoji, 2006; Shneiderman et al., 2006; Shneiderman, 2007; Johnson et al., 2009; Frick et al., 2019), and computational creativity systems that generate new creative products autonomously or facilitate human creativity (Colton & Wiggins, 2012; Maher, 2012; Colton et al., 2015; Jordanous, 2016; Grace et al., 2017; Carnovalini, 2020). 


The resulting systems are either a nanny, pen-pal, coach or computer colleague that act as partners in a co-creative workflow (Lubart, 2005). 


Kantosalo & Toivonen (2014) offer a formal definition of co-creation that is strongly rooted in the literature, quoted below:

"We use the term human-computer co-creation to refer to collaborative creativity where both the human and the computer take creative responsibility for the generation of a creative artefact. The term co-creation refers here to a social creativity process “leading to the emergence and sharing of creative activities and meaning in a socio-technical environment” (Fischer et al. 2005), but with the emphasis that the computer is, instead of only providing the socio-technical environment, also an active participant in the creative activities." (Kantosalo & Toivonen, 2014)

Kantosalo & Takala (2020) elaborate the earlier definition of co-creation to include the creative collective making contributions that get shared with a community and occurring in the situated context of the co-creators. 


Co-creative AI is uniquely positioned as a learning tool for novices to engage with authentic computing concepts (Freeman et al., 2014). Co-creative AI can be used in a variety of interactive art applications,  such as drawing, design, music, poetry, drama, and painting. There are a plethora of tools and technologies related to interactive art in co-creative AI, such as the Drawing Apprentice (Davis et al., 2014), EarSketch (Freeman et al., 2014), Shimon (Weinberg et al., 2009), and the Painting Fool (Colton et al., 2015). Co-creative AI can also serve as a tool to engage and inform the general public about artificial intelligence (e.g. AI literacy (Long & Magerko, 2020) and explainable AI (Ehsan & Riedl, 2020)) and its role in potentially augmenting their creative activities and vocations. 


While it is a critical component of human social cognition, co-creation presents some tough computational challenges that require interdisciplinary solutions, such as:



The situated nature of co-creation requires dynamically changing models of the environment. However, it is difficult to determine when and why to change this type of knowledge in a co-creative agent because there is no right answer, simply a prediction about how effective a contribution might be given the predicted state of the user and the interpreted state of the environment.


In co-creative interactions, ideas grow and evolve dynamically through time, with each contribution being made in relation to both the last turn of their partner as well as all contributions that came previously (Sawyer & DeZutter, 2009). Participants can fluidly take on different roles as they lead and follow their partner. Individuals can provoke their partner by making contributions that are contrary and different from what has been done (Davis et al., 2011). Conversely, participants can play along and follow their partner by adding onto their ideas, refining them, or even transforming them into something new (Karimi et al., 2019). The role of the participants and how they interact with each other are referred to as interaction dynamics (De Jaegher & Di Paolo, 2007), and they constitute the core of collaboration. Interaction dynamics include turn taking, communication strategies, feedback, coordination, and mode of interaction. All of these factors are critical for understanding and modeling co-creation. 

Enactive Co-Creation

The cognitive science theory of enaction (Stewart et al., 2010; Vernon, 2010) can be used as a lens through which to view the design and evaluation of co-creative AI systems (Davis et al., 2016). There are five core ideas to the cognitive science theory of enaction that a co-creative agent could achieve: 



Enaction focuses on how embodied interaction (Varela et al., 2017) with the world grounds meaning structures by gradually detecting patterns of regularity through perception-action sensorimotor loops (Noë, 2004). This occurs in an interactive process called sense-making (Thompson & Stapleton, 2009; Colombetti, 2010). When multiple agents work together to make sense of a situation, such as in the process of co-creation, it is referred to as participatory sense-making (De Jaegher & Di Paolo, 2007; Fuchs & De Jaegher, 2009). This mutually influencing process of participatory sense-making is formally defined as:


"A co-regulated coupling between at least two autonomous agents, where: (i) the co-regulation and the coupling mutually affect each other, constituting an autonomous self- sustaining organization in the domain of relational dynamics and (ii) the autonomy of the agents involved is not destroyed (although its scope can be augmented or reduced)" (De Jaegher, & Di Paolo, 2007)


This quote demonstrates how meaning production in participatory sense-making is significantly influenced and structured by the manner of interacting together, here described as "relational dynamics, i.e. the rhythm of turn taking, ways of providing feedback, and manner of actions (independent from the content of actions)" (Davis et al., 2014). Co-regulated participatory sense-making occurs when agents "make sense of both these relational or interaction dynamics as well as the content of actions in a dual sense-making process that is unique to participatory sense-making and co-creation" (Davis et al., 2014). There may be clear examples of such co-regulation in human collaboration. The field of co-creative AI can strive to design and develop co-creative agents that exhibit the degree of autonomy and sense-making that would give rise to a similar co-regulated participatory sense-making as seen in human co-creation. "This type of interaction [PSM] is defined as the ideal for creative collaboration due to its ability to facilitate emergent meaning in an open ended interaction that goes beyond what the individual user could have accomplished alone" (Davis et al., 2014). 


The complex social interplay of co-creation involves a process of participatory sense-making whereby participants generate meaning through their interactions (De Jaegher & Di Paolo, 2007). This meaning is negotiated through social and interactional cues that guide the interaction (De Jaegher & Di Paolo, 2007). The interaction can be thought of as a trajectory that flows through time with different events perturbing equilibria that form (Davis et al., 2015). Those points of interaction equilibrium are meaning structures that emerge in the co-creative interaction. In the language of dynamical systems, these would be called attractors, and in co-creative interaction they form through a coupling of interaction where the subsequent interactions are based on the previous interactions.


When sense-making is applied to a creative context (e.g. producing a novel and valuable product), it can become creative sense-making (Davis et al., 2017). When multiple agents couple their actions during creative sense-making and build off each other’s contributions through time, it is referred to as enactive co-creation, with a descriptive definition as follows:


"Cognitive agents, from the perspective of enaction, interact to gradually determine patterns of regularity and meaning using dynamic feedback loops (Fuchs & De Jaegher, 2009). Therefore, each collaborator is influenced both by the content of actions (i.e. lines of a drawing) as well as the dynamics of interaction that have emerged during in-the-moment interaction, such as the rhythm of turn taking, area of focus, and manner of motion. This dual feedback system has the potential to lead collaborators into new realms of creative expression with dynamic, multi-layered, and sometimes competing mental models for evaluating what creative contributions ‘make sense’ in the current situation." (Davis et al., 2016)


The participatory sense-making framework from the theory of enaction, as well as the dynamical systems perspective that enaction brings, can be useful in modeling and investigating co-creation due to its exclusive focus on interaction as the cornerstone of cognition. This paper combines the cognitive science theories of enaction with theories of collaborative improvisation to arrive at methods and an enactive model of co-creation that is suited for investigating co-creation. This method and model are useful for describing and modeling co-creation as they provide a vocabulary and technique for quantifying a number of critical variables in a co-creation context. 

3. Co-Creative AI Methodology

The co-creative sense-making (CCSM) framework is a data collection schema for co-creative systems to quantify the co-creation happening on the platform. There are four main categories in the schema: cognitive dynamics, interaction dynamics, collaboration dynamics, and domain behavior. Elements from each of these categories are recorded and visualized by the co-creative system to aid in the analysis of co-creation. The CCSM is shown in Figure X. 

Figure X: The co-creative sense-making framework.


Interaction Dynamics

Interaction coupling occurs when participants are building upon each other’s contributions by actively leveraging what their partner contributed. In drawing, this can mean drawing in a similar area of the canvas for more than one turn. When coupling occurs, it has a number of features that can be captured. We separate these features into four categories: what, how, when, and where, as shown in Figure X.

Figure X: Features of interaction couplings captured in the domain of drawing.


In co-creation, participants’ turns can be synchronous or asynchronous depending on the capabilities of the system as well as the preferences of those involved. The synchrony feature captures whether the participant and agent are both actively contributing to the creative product at the same time. The proximity feature captures whether the contributions were happening in a similar region of the creative product. For example, with drawing, proximity would correspond to the physical location of the drawn mark on the page. In poetry, proximity would be the location of the word or line relative to the whole poem.
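As a minimal sketch of how the synchrony and proximity features might be computed for drawing, the example below treats a turn as a time interval and a stroke as a list of (x, y) points. The function names, the centroid-distance measure, and the 100-unit distance threshold are illustrative assumptions, not details of any system described here.

```python
from math import hypot

def is_synchronous(user_span, agent_span):
    """Return True when two (start, end) time intervals overlap."""
    return user_span[0] < agent_span[1] and agent_span[0] < user_span[1]

def centroid(points):
    """Mean (x, y) position of a stroke given as a list of points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def is_proximate(stroke_a, stroke_b, max_dist=100.0):
    """Return True when stroke centroids lie within max_dist canvas units.

    max_dist is a hypothetical threshold chosen for illustration.
    """
    (ax, ay), (bx, by) = centroid(stroke_a), centroid(stroke_b)
    return hypot(ax - bx, ay - by) <= max_dist

# Hypothetical data: the user draws from t=0s to t=5s, the agent from t=3s to t=8s.
user_stroke = [(0.0, 0.0), (10.0, 0.0)]
agent_stroke = [(50.0, 0.0), (60.0, 0.0)]
sync = is_synchronous((0.0, 5.0), (3.0, 8.0))
near = is_proximate(user_stroke, agent_stroke)
```

For a text domain such as poetry, the same idea could use line indices in place of canvas coordinates.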

Feedback is a critical part of the co-creative process as it helps participants coordinate their actions and negotiate shared meaning structures (De Jaegher, 2009). There are multiple channels through which feedback can occur, and it can be both implicit and explicit. For example, the AI Drawing Partner has voting buttons integrated into the system that enable the user to express explicit positive or negative feedback. There can also be implicit signals, such as the user pressing the undo button after the agent’s turn.

Cognitive Dynamics 

The cognitive modeling convention introduced by Davis et al. (2017) in the creative sense-making framework is used to record the cognitive dynamics in the CCSM. The creative sense-making framework is an initial step towards quantifying the interaction dynamics of co-creation using the cognitive science theory of enaction. It proposes cognitive states or cognitive modes that fluctuate through time to form sense-making processes in the creative process. There are three modes of cognition in the CSM framework: clamped, functional unclamp, and interactional unclamp. Clamped cognition means one knows what to do and is fluidly executing an on-task action. Functional unclamp means one is regulating interactions with the environment in some manner, while interactional unclamp means one is waiting, pausing or hesitating, perhaps to think and reflect. Each of these states is mutually exclusive, and the framework proposes that the agent must be in one of these three states at any given moment. Each cognitive state has behavioral markers that can be used to detect its presence during the coding process. 


Each cognitive state is given a value during the coding process. Clamped is -1, functional unclamp is 1, and interactional unclamp is 0. These three modes of cognition translate to four modes of interaction: communicate is a full functional unclamp coded as 1, manipulate interface is a partial functional unclamp coded as 0.5, wait is an interactional unclamp coded as 0, and execute actions is clamped coded as -1. The interaction mode of the agent is continuously coded through time to provide the raw interaction data from which we can determine different types of sense-making. These coded values are summed to produce a creative sense-making curve. To demonstrate the raw interaction dynamics and creative sense-making curve, a ten-minute co-creative drawing session was conducted with the AI Drawing Partner. The artwork produced during the session is shown in Figure X. The curves from the session are shown in Figure X.

 

Figure X: Artwork produced during the ten minute co-creative drawing session with the AI Drawing Partner. 

  

Figure X: Raw interaction dynamics (left) and creative sense-making curve (right) for a ten minute co-creative drawing session. 
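The coding scheme above can be sketched in a few lines. The example below maps each interaction mode to its stated value and builds the curve as a running (cumulative) sum. Reading "summed" as a running sum and sampling one label per fixed-length time step are assumptions, and the mode sequence is made up for illustration.

```python
from itertools import accumulate

# Codes for the four interaction modes as stated in the text.
MODE_CODES = {
    "communicate": 1.0,           # full functional unclamp
    "manipulate_interface": 0.5,  # partial functional unclamp
    "wait": 0.0,                  # interactional unclamp
    "execute": -1.0,              # clamped
}

def code_modes(modes):
    """Translate a sequence of mode labels into raw interaction values."""
    return [MODE_CODES[m] for m in modes]

def sense_making_curve(codes):
    """Running sum of the coded values (assumed reading of 'summed')."""
    return list(accumulate(codes))

# Hypothetical sequence, one label per time step.
modes = ["wait", "communicate", "manipulate_interface",
         "execute", "execute", "wait"]
raw = code_modes(modes)
curve = sense_making_curve(raw)
print(curve)  # [0.0, 1.0, 1.5, 0.5, -0.5, -0.5]
```

Under this reading, stretches of regulation push the curve upward, stretches of execution pull it downward, and waiting leaves it flat.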


The sense-making curve provides a visually readable linear representation of the cognitive fluctuations and interaction dynamics of a co-creative session through time. This curve can be annotated visually or analyzed with linear regression to provide the slope, intercept, and r-squared value. We employed an analytical method similar to stock market technical analysis, which is designed to identify 'buy', 'sell', and 'hold' trends. In our context, these correspond to 'regulate', 'execute', and 'wait', respectively. The analysis uses the Moving Average Convergence Divergence (MACD) indicator. Whenever the MACD line crosses the signal line, the trend switches. A threshold is applied so that the MACD line must be a certain distance from the signal line to produce a regulate or execute trend; all other data points are classified as hold (i.e. wait). This produces a dataset with a trend classification at each data point. These trends are summed and visualized, as shown in Figure X.


Figure X: Interaction trend sequences for a ten minute co-creative drawing session. 
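The MACD-based trend classification described above can be sketched as follows. The text does not state the moving-average spans or the threshold, so the conventional 12/26/9 spans and a threshold of 0.05 are assumptions, as are the function names. Mapping a MACD line above the signal line to 'regulate' follows the buy/regulate correspondence given in the text.

```python
def ema(values, span):
    """Exponential moving average seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def macd_trends(curve, fast=12, slow=26, signal=9, threshold=0.05):
    """Label each point of a sense-making curve as regulate, execute, or wait.

    fast, slow, signal, and threshold are assumed values for illustration.
    """
    macd = [f - s for f, s in zip(ema(curve, fast), ema(curve, slow))]
    sig = ema(macd, signal)
    trends = []
    for m, s in zip(macd, sig):
        gap = m - s
        if gap > threshold:
            trends.append("regulate")   # analogous to 'buy'
        elif gap < -threshold:
            trends.append("execute")    # analogous to 'sell'
        else:
            trends.append("wait")       # analogous to 'hold'
    return trends

# Hypothetical curves: steadily rising, steadily falling, and flat.
rising = macd_trends([float(i) for i in range(60)])
falling = macd_trends([-float(i) for i in range(60)])
flat = macd_trends([0.0] * 60)
```

Counting the labels in each returned list gives one simple way to summarize how much of a session was spent in each trend.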


With this data, it is possible to compare the sense-making activities between participants and between sessions. The raw interaction mode code counts can be compared between conditions. The slope and features of the CSM curve can also be compared between conditions. The interaction trend sequence classification can be used to identify sequential patterns in the data.
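To compare the slope and related features of the CSM curve between conditions, one option is an ordinary least-squares fit against the time index. The sketch below is a plain-Python version that assumes evenly spaced samples and at least two data points; the function name is hypothetical.

```python
def linear_fit(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...

    Returns (slope, intercept, r_squared). Requires len(ys) >= 2.
    """
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # A perfectly flat curve has no variance to explain; report 1.0 by convention.
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared

# Hypothetical curve that rises by two units per time step.
slope, intercept, r_squared = linear_fit([1.0, 3.0, 5.0, 7.0])
```

A positive slope would indicate a session dominated by regulation, and a negative slope one dominated by execution, under the coding values given earlier.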

Collaboration Dynamics

Counting the number of turns can help distinguish between sessions with longer turns, where participants may be expressing more complete ideas, and shorter turns, where more improvisation and in-the-moment thinking may be occurring. Within a turn, a player may make an offer. An offer is a concept from improv theory that refers to the introduction of a new idea into the collaboration (Fuller & Magerko, 2011). This offer can be accepted or rejected. Accepting the offer means acknowledging the contribution and building upon it, while rejecting the offer entails ignoring the contribution and starting something new. If the offer is accepted, the other player can choose to elaborate on their partner’s contribution, which means adding content to a previously established idea. Thus, there are four modes of collaboration defined in the CCSM: new idea, accept idea, reject idea, and elaboration. These collaboration modes are tracked in the CCSM to determine the dynamics of collaboration through time.
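One way to record the four collaboration modes alongside turn counts is sketched below. The `Turn` record, its field names, and the `summarize` helper are hypothetical and only illustrate the kind of tally the text describes.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum

class CollabMode(Enum):
    NEW_IDEA = "new idea"
    ACCEPT = "accept idea"
    REJECT = "reject idea"
    ELABORATE = "elaboration"

@dataclass
class Turn:
    player: str        # e.g. "user" or "agent"
    mode: CollabMode   # collaboration mode coded for this turn
    duration_s: float  # turn length in seconds (assumed unit)

def summarize(turns):
    """Count turns, average their length, and tally collaboration modes."""
    counts = Counter(t.mode for t in turns)
    mean_duration = sum(t.duration_s for t in turns) / len(turns)
    return {"n_turns": len(turns),
            "mean_duration_s": mean_duration,
            "mode_counts": counts}

# Hypothetical three-turn exchange.
turns = [
    Turn("user", CollabMode.NEW_IDEA, 12.0),
    Turn("agent", CollabMode.ACCEPT, 4.0),
    Turn("user", CollabMode.ELABORATE, 8.0),
]
summary = summarize(turns)
```

Comparing `mean_duration_s` across sessions gives a rough handle on the longer-turn versus shorter-turn distinction mentioned above.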

Domain Behavior

There are two dimensions of the domain behavior that can be captured regardless of the exact creative domain: the amount of content produced and the action history. For the first dimension, the AI Drawing Partner records the amount of content produced per turn (e.g. number of lines and average length of line), which helps inform research into the nature of turn-taking. The overall amount of content produced can also be recorded to compare users who produce different amounts of content within similar timeframes. For the second dimension, the AI Drawing Partner records every action the user takes within the system along with the timestamp for that action. There are 18 total actions the user can engage in while interacting with the system (e.g. draw, fill, smudge, request sketch, request image, stylize, draw together). The action history can be used in two ways: 1) comparing actions between participants, and 2) defining the type of sense-making that was occurring in the CSM curve. Overlaying the action history data on top of the CSM curve will elucidate what participants were doing while they were engaged in sense-making.
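A timestamped action history of this kind could be stored and queried as in the sketch below. The class and method names are hypothetical and do not reflect the AI Drawing Partner's actual logging code; the time-window query is one assumed way to line up actions with a segment of the CSM curve.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class ActionEvent:
    timestamp_s: float  # seconds since session start (assumed unit)
    action: str         # e.g. "draw", "fill", "smudge"

@dataclass
class SessionLog:
    events: list = field(default_factory=list)

    def record(self, timestamp_s, action):
        """Append one user action with its timestamp."""
        self.events.append(ActionEvent(timestamp_s, action))

    def action_counts(self):
        """Tally actions, e.g. for comparison between participants."""
        return Counter(e.action for e in self.events)

    def actions_between(self, start_s, end_s):
        """Actions in [start_s, end_s), for overlay on a curve segment."""
        return [e.action for e in self.events
                if start_s <= e.timestamp_s < end_s]

# Hypothetical session fragment.
log = SessionLog()
for t, a in [(1.0, "draw"), (4.5, "draw"),
             (9.0, "request sketch"), (12.0, "fill")]:
    log.record(t, a)
```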

4. Enactive Model of Co-Creation (EMC2)

Co-creation occurs when two or more people collaborate on a shared creative product. In co-creation, meaning emerges dynamically as a result of the interaction of the participants involved. The open-ended and improvisational nature of co-creation makes it particularly difficult to study and quantify. While there are many related and relevant theories, none alone offers a means to quantify and study co-creation. Several frameworks from the cognitive science paradigm of enaction were combined with improv theory to arrive at an enactive model of co-creation that describes how meaning emerges through coupled interactions and offers techniques for quantifying different dimensions of those couplings.


Davis et al. (2015) developed the enactive characterization of pretend play, which examines how meaning is not pre-planned by the players in a co-creation but rather emerges from the situation and the actions of the other player. In this characterization, participants perform seed actions that can go on to form a nucleus activity that can be added onto through subsequent actions. Certain actions are so far from the nucleus activity (either conceptually or physically) that they branch off into their own nucleus activity. All these nucleus activities then combine to form a narrative guiding the play. The players do not have a plan that is then updated and repaired based on what their partner does, as cognitivist views would propose; rather, meaning emerges through the interaction and guides the play going forward.


In the enactive characterization of pretend play, there are five stages of pretend play: prepare the mind, build meaning, enact the narrative, deepen the narrative, and maintain the flow. Preparing the mind involves getting into a mindset that is open to playing pretend, i.e. cultivating a playful mindset. Next, the players can physically arrange the space around them to support their activities and build meaning. With some meaningful structures in place, they can then enact a narrative, meaning they take on the embodied role of a character and play in a space. It is here that certain actions become seed actions that form nucleus activities that can be added upon. Each of the nucleus activities has some meaning associated with it, and typically meaning is successively added upon whereby new contributions take into account what has been done before. As a result, a narrative begins to emerge that ties together the meaning of the different nucleus activities. This overarching narrative begins to maintain the flow of the interaction and suggest potential new activities to do. This characterization of pretend play can be helpful for understanding co-creation in general as it relates to the general structure of a co-creative interaction and the stages therein.


Improvisation theory also introduces some useful concepts when investigating co-creation. In improv, participants make what are referred to as ‘offers’ through their actions that can then be either accepted or rejected by the other participants. Accepting an offer means using it and building upon it in one’s own turn. Rejecting an offer, on the other hand, refers to ignoring or completely changing what has been suggested by the improv partner. In improv theory, there is an emphasis on a practice referred to as yes-and, whereby the participants always attempt to play along with what has been said rather than rejecting it. In this way, the participants work to integrate the contribution of their partner and build upon what has been said. Each turn introduces new elements that change the direction of the improvisation, and meaning emerges organically through the interaction. 


By combining the improv concept of offers with participatory sense-making, creative sense-making, and the enactive characterization of pretend play, a model of co-creation is derived that is focused on interaction, but critically introduces elements to characterize and quantify that interaction more directly. The concept of offers is present in the EMC2 to help understand the circumstance under which a coupling occurred. The offer is the reason and motivation for a coupling. Considering either independently does not tell the whole story about interaction. If an offer is accepted, a coupling occurs and participatory sense-making begins. Here, we can examine the nature of that coupling, such as who initiated it, the turn taking structure, its length, the influence of each player, and who ended it. If extended beyond one turn, the coupling forms a nucleus activity that acts as an attractor to guide interactions by providing some meaning to structure actions. For example, in a drawing, the participants can find a face that prompts them to add elements to that face, such as eyes, ears, and a nose. 
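The link between offers, couplings, and nucleus activities can be sketched as a pass over a coded turn sequence. The example below assumes each turn is coded with one of the four CCSM collaboration modes, that a coupling starts when an offer is accepted and ends when a turn is coded as a new or rejected idea, and that the initiator is the player who made the accepted offer on the preceding turn. These rules and the function name are illustrative assumptions.

```python
def find_couplings(turns):
    """Extract couplings from a list of (player, mode) tuples.

    mode is one of: "new idea", "accept idea", "reject idea", "elaboration".
    """
    couplings, current = [], None
    for i, (player, mode) in enumerate(turns):
        if mode in ("accept idea", "elaboration"):
            if current is None:
                # The accepted offer is assumed to come from the previous turn.
                initiator = turns[i - 1][0] if i > 0 else player
                current = {"initiator": initiator, "start": i, "length": 0}
            current["length"] += 1
        elif current is not None:
            current["ended_by"] = player
            couplings.append(current)
            current = None
    if current is not None:
        current["ended_by"] = None  # still coupled when the session ended
        couplings.append(current)
    for c in couplings:
        # A coupling that extends beyond one turn forms a nucleus activity.
        c["nucleus"] = c["length"] > 1
    return couplings

# Hypothetical coded session.
session = [
    ("user", "new idea"),
    ("agent", "accept idea"),
    ("user", "elaboration"),
    ("agent", "new idea"),
    ("user", "accept idea"),
]
couplings = find_couplings(session)
```

Each returned record carries the initiator, the coupling length, and who ended it, which are the properties of a coupling listed in the paragraph above.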


Once a coupling takes place, cognition becomes ‘clamped’ to that meaning structure and works to interpret elements through the lens of that meaning as well as generating new actions related to that meaning. When participants are clamped, they achieve an interaction equilibrium where new ideas do not require a lot of energy to generate since there is an attractor, i.e. a nucleus activity, onto which they can easily add. The role of a nucleus activity is a dynamic attractor for thought that structures and guides the generation of new actions in a co-creation. The dynamism of the nucleus activity is what makes it particularly interesting in co-creation. It can evolve, shift, and transform based on the negotiations of the participants. 


Surprising stimuli from the environment can unclamp cognition and force the participant to engage in a process of sense-making to rectify what they expected with what they experienced in their environment. This can lead to two outcomes, 1) expanding the nucleus activity to accommodate the new addition, or 2) uncoupling from the nucleus activity that was formed. Expanding the nucleus activity would mean coming up with some meaning that connects what the participant just did with what was recently happening in the co-creation. Conversely, uncoupling signifies the end of a nucleus activity. If an uncoupling event occurs, one of the players can make another offer to begin the cycle of participatory sense-making again.


We can study clamped and unclamped cognition by looking at the fluidity with which new ideas are added to a coupled interaction and quantify the rhythm of contributions by each partner by analyzing the amount of time participants spent thinking during each turn. Those couplings with fluid interactions (i.e. very little hesitation) would be considered tightly coupled, while those with hesitation would be considered loosely coupled. A tightly coupled interaction could mean participants developed a strong attractor and robust nucleus activity that has a lot of opportunities for interaction, whereas a loosely coupled interaction could signify a weaker attractor and less developed nucleus activity. 
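The tight versus loose distinction could be operationalized by thresholding the hesitation time within a coupling. In the sketch below, the use of mean hesitation and the two-second cutoff are arbitrary assumptions chosen for illustration, as is the function name.

```python
def classify_coupling(hesitations_s, tight_max_s=2.0):
    """Label a coupling by the mean thinking time before each turn.

    hesitations_s: seconds of hesitation preceding each turn in the coupling.
    tight_max_s: hypothetical cutoff separating tight from loose coupling.
    """
    mean_hesitation = sum(hesitations_s) / len(hesitations_s)
    return "tight" if mean_hesitation <= tight_max_s else "loose"

# Hypothetical couplings: one fluid, one with long pauses.
fluid = classify_coupling([0.5, 1.0, 0.8])
hesitant = classify_coupling([4.0, 6.5, 3.0])
```

An empirically derived cutoff, or a continuous measure instead of two labels, may suit real session data better.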


From successive nucleus activities, a narrative begins to emerge that ties together those nucleus activities and maintains the overall flow of the interaction. The narrative justifies why certain elements have been added to the co-creation and what could potentially come next. Often emerging through communication, the narrative is a type of retrospective rationalization about how the different elements of a co-creation relate to each other. If there is no relationship, participants can develop one to help contextualize the contribution and make a more coherent narrative. 

Figure X: The Enactive Model of Co-Creativity 


Figure X shows the enactive model of co-creativity and demonstrates how offers lead to coupled interactions and clamped cognition. In the figure, an action is represented by the curve approaching the timeline, while waiting is represented by the curve moving away from it. The players begin by making offers to each other until one is accepted. During this time, cognition fluctuates between clamped and unclamped states during periods of sense-making in which the players come up with ideas and express them. Once an offer is accepted, each turn is related to the partner's previous turn and a coupling occurs. A nucleus activity emerges onto which cognition can clamp to generate actions more fluidly. This nucleus activity serves as an attractor that creates an interaction equilibrium during which it is easier to generate actions. Then, when player 1 makes a new offer during the coupled interaction, it decouples the interaction and the players go back to making individual offers that are unrelated to what their partner is doing. Finally, an offer is accepted and a new coupling begins. 
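
The progression described above can be sketched as a simple segmentation over a coded event sequence. The following is a minimal illustration, assuming each turn has already been labeled as an `offer`, an `accept`, or a related `turn`; these labels and the function name are hypothetical and are not part of the model's coding scheme.

```python
def segment_couplings(events):
    """Return (start, end) index pairs for each coupled stretch.

    A coupling starts when an offer is accepted and ends just before
    the next new offer, which decouples the interaction.
    """
    segments = []
    start = None  # index where the current coupling began, or None if uncoupled
    for i, event in enumerate(events):
        if start is None:
            if event == "accept":
                start = i
        elif event == "offer":
            segments.append((start, i - 1))
            start = None
    if start is not None:
        # The session ended while the players were still coupled.
        segments.append((start, len(events) - 1))
    return segments


session = ["offer", "offer", "accept", "turn", "turn",
           "offer", "offer", "accept", "turn"]
print(segment_couplings(session))
```

In this sketch, any new offer made during a coupling is treated as decoupling it. The alternative outcome, in which the nucleus activity expands to accommodate the addition, would need an extra label and is omitted here for brevity.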


To demonstrate the EMC2, a model was produced of the ten-minute co-creative drawing session presented in Section X. The EMC2 is visually depicted in Figure X. To produce the EMC2 curve, the coded values are transformed so that the user appears above the x-axis and the agent below it. For the user, waiting is coded as 1, communication and interface manipulation are coded as .5, and executing actions is coded as 0. For the agent, waiting is coded as -1, communication is coded as -.5, and executing actions is coded as 0. 
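
This transformation can be sketched as a lookup from coded modes to curve values. The mode names and function names below are hypothetical, and the sketch assumes that the agent's communication value mirrors the user's below the axis (-.5), consistent with the agent being plotted below the x-axis.

```python
# Curve values per cognitive mode. The user is plotted above the x-axis
# and the agent below it. The agent's -0.5 is an assumption (see lead-in).
USER_CODES = {
    "wait": 1.0,
    "communicate": 0.5,
    "manipulate_interface": 0.5,
    "execute_action": 0.0,
}
AGENT_CODES = {
    "wait": -1.0,
    "communicate": -0.5,
    "execute_action": 0.0,
}


def emc2_value(player, mode):
    """Map one coded (player, mode) pair to its height on the curve."""
    if player == "user":
        return USER_CODES[mode]
    if player == "agent":
        return AGENT_CODES[mode]
    raise ValueError(f"unknown player: {player!r}")


def emc2_curve(events):
    """Turn (timestamp_seconds, player, mode) events into time-ordered
    (timestamp_seconds, value) points ready for plotting."""
    return [(t, emc2_value(player, mode)) for t, player, mode in sorted(events)]


events = [
    (2.0, "agent", "wait"),
    (0.0, "user", "execute_action"),
    (1.0, "user", "communicate"),
]
print(emc2_curve(events))
```

Actions therefore sit on the timeline (0) for both participants, and waiting sits farthest from it, which matches the visual convention of the figure.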


Figure X: The enactive model of co-creativity for a ten-minute co-creative drawing session. The EMC2 curve is on top, and the raw cognitive dynamics are on bottom. 


This model of co-creation focuses on the interaction of the participants and the meaning that emerges as a result of that interaction. The concepts presented provide an avenue for quantifying the interaction dynamics of co-creation in a way that was not previously possible. In particular, combining the concept of offers with that of coupling provides a method for determining why a coupling occurred in a collaboration. It is also critical to understand the features of that coupling in a way that can be quantified and compared between couplings. The enactive literature describes what couplings are and how they occur, but it does not offer a technique for quantifying them in order to understand open-ended interaction. Nucleus activities introduce a way to understand how meaning emerges from one small seed action and grows through subsequent turns. Clamped and unclamped cognition begin to delve into the cognitive state of the participants and how this state can influence the interpretation and generation of actions. Finally, narrative is stronger in some creative domains than in others and sometimes may not emerge at all, but it can be a driving force of a co-creation, depending on the communicative circumstances of the players involved and the type of domain they are working in. The enactive model of co-creation is domain independent and applicable to many co-creative domains, such as art, pretend play, music, dance, and improv. 

5. Discussion

Participatory sense-making provides a vocabulary and conceptual foundation upon which to describe the social elements of co-creation, such as interaction dynamics, negotiating shared meaning, coordinating interactions, and forming feedback loops. PSM uses concepts from dynamical systems to describe structurally coupled interactions, i.e. those interactions that share some similarity between successive turns. PSM occurs in co-creation when participants regulate both their experience with the shared product and the social element of interacting with the collaborator. 


The CCSM quantifies elements from participatory sense-making. It captures data about the user’s cognition, interaction, collaboration, and domain behaviors. The cognitive data corresponds to the clamped/unclamped state of cognition as it relates to modes of interaction (e.g. communicate, manipulate interface, wait, execute action). The interaction data includes data about interaction couplings, communication strategies, feedback, and turn-taking. The interaction couplings are segmented into four categories: what, when, where, and how, with several features defining each category. The collaboration data deals with improvisational offers made and whether they were accepted or rejected. Domain behaviors are the unique actions afforded by each creative domain. Together, this data quantifies the co-creative process. 
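
To make the four data categories concrete, the sketch below shows one way a per-turn record could be structured. It is a hypothetical simplification rather than the CCSM schema itself: the class and field names are invented for illustration, and each coupling category (what, when, where, how) is reduced to a single boolean even though the framework defines several features per category.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Mode(Enum):
    """Modes of interaction used as the cognitive data."""
    COMMUNICATE = "communicate"
    MANIPULATE_INTERFACE = "manipulate interface"
    WAIT = "wait"
    EXECUTE_ACTION = "execute action"


@dataclass
class Coupling:
    """Interaction data: whether this turn relates to the partner's
    previous turn in each of the four categories (simplified to flags)."""
    what: bool = False
    when: bool = False
    where: bool = False
    how: bool = False

    def is_coupled(self):
        return any((self.what, self.when, self.where, self.how))


@dataclass
class TurnRecord:
    timestamp: float                        # seconds since session start
    player: str                             # "user" or "agent"
    mode: Mode                              # cognitive data
    coupling: Coupling = field(default_factory=Coupling)
    offer_made: bool = False                # collaboration data
    offer_accepted: Optional[bool] = None   # None until the offer is resolved
    domain_behavior: str = ""               # domain-specific action label


def offer_acceptance_rate(records):
    """Fraction of resolved offers that were accepted, or None if there are none."""
    decided = [r for r in records if r.offer_made and r.offer_accepted is not None]
    if not decided:
        return None
    return sum(r.offer_accepted for r in decided) / len(decided)


records = [
    TurnRecord(0.0, "user", Mode.EXECUTE_ACTION, offer_made=True, offer_accepted=True),
    TurnRecord(1.0, "agent", Mode.EXECUTE_ACTION, coupling=Coupling(what=True)),
    TurnRecord(2.0, "user", Mode.EXECUTE_ACTION, offer_made=True, offer_accepted=False),
]
print(offer_acceptance_rate(records))
```

Keeping one record per turn makes it straightforward to derive summary measures, such as the acceptance rate shown here, after a session ends.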


The CCSM can be used in conjunction with a survey instrument and semi-structured interview questions to ascertain details about the user’s qualitative experience. The mixed-initiative creativity support index (MICSI) (Lawton et al., 2023) can be paired with the CCSM to assess the relative creativity support as well as dimensions of human-AI collaboration. 


The enactive model of co-creation is a method for visually illustrating a co-creative interaction as it progresses through time. It depicts offers and whether they were accepted. If an offer is accepted, an interaction coupling begins and a nucleus activity forms. That nucleus activity acts as an attractor, structuring the turns that are part of the interaction coupling. The model also depicts the user's cognitive state as they interact. This modeling convention can be encoded into the co-creative AI system, similar to the CCSM, and the data from the CCSM can be used to populate the visualizations. The CCSM captures when an offer was made and whether it was accepted. It also records the cognitive mode of the individual. A co-creative system that utilizes the CCSM would therefore have the resources to visualize the EMC2. This type of visualization would help with the explainability of co-creative systems (Ehsan & Riedl, 2020). 

6. Conclusions

This paper presented a unified theory, method, and model of co-creative AI. An enactive theory of co-creative AI was presented, highlighting participatory sense-making and interaction dynamics as core features of co-creation. The theory emphasized how co-creation is open-ended, dynamic, improvisational, and emergent. Meaning is negotiated and built dynamically through interactions in co-creation. Enactive co-creation was defined, where participants achieve participatory sense-making by co-regulating both the content of their actions and the social interactions with their collaborator. Co-creative sense-making was presented, which is a method for quantifying co-creation during the co-creative process. It has four main categories in the data collection schema: cognitive dynamics, interaction dynamics, collaboration dynamics, and domain behaviors. Elements from each of these categories are quantified and visualized in the CCSM approach. The enactive model of co-creativity was presented, which describes how interaction couplings produce nucleus activities that serve as attractors to guide, facilitate, and structure interactions. A visual modeling convention for the enactive model of co-creativity was presented that visualizes turn-taking, improv offers, interaction coupling, and cognitive states through time. Together, the theory, method, and model offer a cohesive framework for conducting co-creative AI research. 




References

Boden, M. (1994). Creativity and computers. In Artificial intelligence and creativity (pp. 3-26). Springer, Dordrecht.

Candy, L., & Edmonds, E. (2002, October). Modeling co-creativity in art and technology. In Proceedings of the 4th conference on Creativity & cognition (pp. 134-141). ACM.

Carnovalini, F., & Rodà, A. (2020). Computational creativity and music generation systems: An introduction to the state of the art. Frontiers in Artificial Intelligence, 3, 14.

Colton, S., Halskov, J., Ventura, D., Gouldstone, I., Cook, M., & Ferrer, B. P. (2015, June). The Painting Fool Sees! New Projects with the Automated Painter. In ICCC (pp. 189-196).

Colton, S., & Wiggins, G. A. (2012, August). Computational creativity: The final frontier?. In ECAI (Vol. 12, pp. 21-26).

Colombetti, G. (2010). Enaction, sense-making and emotion. Enaction: Toward a new paradigm for cognitive science, 145-164.

De Jaegher, H., & Di Paolo, E. (2007). Participatory sense-making. Phenomenology and the cognitive sciences, 6(4), 485-507.

Davis, N. M., Popova, Y., Sysoev, I., Hsiao, C. P., Zhang, D., & Magerko, B. (2014). Building Artistic Computer Colleagues with an Enactive Model of Creativity. In ICCC (pp. 38-45).

Davis, N., Hsiao, C. P., Yashraj Singh, K., Li, L., & Magerko, B. (2016, March). Empirically studying participatory sense-making in abstract drawing with a co-creative cognitive agent. In Proceedings of the 21st International Conference on Intelligent User Interfaces (pp. 196-207).

Ehsan, U., & Riedl, M. O. (2020, July). Human-centered explainable AI: Towards a reflective sociotechnical approach. In International Conference on Human-Computer Interaction (pp. 449-466). Springer, Cham.

Fischer, G., Giaccardi, E., Eden, H., Sugimoto, M., & Ye, Y. (2005). Beyond binary choices: Integrating individual and social creativity. International Journal of Human-Computer Studies, 63(4), 482-512.

Fuchs, T., & De Jaegher, H. (2009). Enactive intersubjectivity: Participatory sense-making and mutual incorporation. Phenomenology and the cognitive sciences, 8(4), 465-486.

Freeman, J., Magerko, B., McKlin, T., Reilly, M., Permar, J., Summers, C., & Fruchter, E. (2014, March). Engaging underrepresented groups in high school introductory computing through computational remixing with EarSketch. In Proceedings of the 45th ACM technical symposium on Computer science education (pp. 85-90).

Frich, J., MacDonald Vermeulen, L., Remy, C., Biskjaer, M. M., & Dalsgaard, P. (2019, May). Mapping the landscape of creativity support tools in HCI. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1-18).

Grace, K., Maher, M. L., Mohseni, M., & Pérez y Pérez, R. (2017). Encouraging p-creative behaviour with computational curiosity. In ICCC.

Gross, M. D. (1996). The electronic cocktail napkin—a computational environment for working with design diagrams. Design studies, 17(1), 53-69.

Gross, M. D., & Do, E. Y. L. (1996, April). Demonstrating the Electronic Cocktail Napkin: a paper-like interface for early design. In Conference Companion on Human Factors in Computing Systems (pp. 5-6).

Jordanous, A. (2016). Four PPPPerspectives on computational creativity in theory and in practice. Connection Science, 28(2), 194-216.

Johnson, G., Gross, M. D., Hong, J., & Do, E. Y. L. (2009). Computational support for sketching in design: a review. Foundations and Trends® in Human–Computer Interaction, 2(1), 1-93.

Kantosalo, A., Toivanen, J. M., Xiao, P., & Toivonen, H. (2014, June). From Isolation to Involvement: Adapting Machine Creativity Software to Support Human-Computer Co-Creation. In ICCC (pp. 1-7).

Kantosalo, A., & Toivonen, H. (2016, June). Modes for creative human-computer collaboration: Alternating and task-divided co-creativity. In Proceedings of the seventh international conference on computational creativity (pp. 77-84).

Kantosalo, A., & Takala, T. (2020, September). Five C's for Human-Computer Co-Creativity-An Update on Classical Creativity Perspectives. In ICCC (pp. 17-24).

Long, D., & Magerko, B. (2020, April). What is AI literacy? Competencies and design considerations. In Proceedings of the 2020 CHI conference on human factors in computing systems (pp. 1-16).

Lubart, T. (2005). How can computers be partners in the creative process: classification and commentary on the special issue. International Journal of Human-Computer Studies, 63(4), 365-369.

Maher, M. L. (2012, May). Computational and collective creativity: Who's being creative?. In ICCC (pp. 67-71).

Nakakoji, K. (2006, November). Meanings of tools, support, and uses for creative design processes. In International design research symposium (Vol. 6, pp. 156-165).

Noë, A. (2004). Action in perception. MIT press.

Shneiderman, B. (2007). Creativity support tools: accelerating discovery and innovation. Communications of the ACM, 50(12).

Shneiderman, B., Fischer, G., Czerwinski, M., Resnick, M., Myers, B., Candy, L., ... & Terry, M. (2006). Creativity support tools: Report from a US National Science Foundation sponsored workshop. International Journal of Human-Computer Interaction, 20(2), 61-77.

Sternberg, R. J., & Lubart, T. I. (1999). The concept of creativity: Prospects and paradigms. In R. J. Sternberg (Ed.), Handbook of creativity (pp. 3-15). Cambridge University Press.

Stewart, J., Gapenne, O., & Di Paolo, E. A. (Eds.). (2010). Enaction: Toward a new paradigm for cognitive science. MIT press.

Thompson, E., & Stapleton, M. (2009). Making sense of sense-making: Reflections on enactive and extended mind theories. Topoi, 28(1), 23-30.

Varela, F. J., Thompson, E., & Rosch, E. (2017). The embodied mind, revised edition: Cognitive science and human experience. MIT press.

Vernon, D. (2010). Enaction as a conceptual framework for developmental cognitive robotics. Paladyn, 1(2), 89-98.

Weinberg, G., Raman, A., & Mallikarjuna, T. (2009, March). Interactive jamming with Shimon: a social robotic musician. In Proceedings of the 4th ACM/IEEE international conference on Human robot interaction (pp. 233-234).

Yannakakis, G. N., Liapis, A., & Alexopoulos, C. (2014). Mixed-initiative co-creativity. In Proceedings of the 9th International Conference on the Foundations of Digital Games (FDG 2014).






Bown, O., Grace, K., Bray, L., & Ventura, D. (2020). A Speculative Exploration of the Role of Dialogue in Human-Computer Co-creation. In ICCC (pp. 25-32).

Sawyer, R. K., & DeZutter, S. (2009). Distributed creativity: How collective creations emerge from collaboration. Psychology of aesthetics, creativity, and the arts, 3(2), 81.

Davis, N., Do, E. Y. L., Gupta, P., & Gupta, S. (2011, November). Computing harmony with PerLogicArt: perceptual logic inspired collaborative art. In Proceedings of the 8th ACM conference on Creativity and cognition (pp. 185-194).

Karimi, P., Maher, M. L., Davis, N., & Grace, K. (2019). Deep learning in a computational model for conceptual shifts in a co-creative design system. arXiv preprint arXiv:1906.10188.

Davis, N., Comerford, M., Jacob, M., Hsiao, C. P., & Magerko, B. (2015, June). An enactive characterization of pretend play. In Proceedings of the 2015 ACM SIGCHI Conference on Creativity and Cognition (pp. 275-284).

Abdellahi, S., Maher, M. L., & Siddiqui, S. (2020). Arny: A Co-Creative System Design based on Emotional Feedback. In ICCC (pp. 81-84).


Bang, Y., Cahyawijaya, S., Lee, N., Dai, W., Su, D., Wilie, B., ... & Fung, P. (2023). A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity. arXiv preprint arXiv:2302.04023.


Biles, J. A. (2002). GenJam: Evolution of a jazz improviser. In Creative evolutionary systems (pp. 165-187). Morgan Kaufmann.


Bryan-Kinns, N., Ford, C., Chamberlain, A., Benford, S. D., Kennedy, H., Li, Z., ... & Rezwana, J. (2023, June). Explainable AI for the Arts: XAIxArts. In Proceedings of the 15th Conference on Creativity and Cognition (pp. 1-7).


Chen, M., Tworek, J., Jun, H., Yuan, Q., Pinto, H. P. D. O., Kaplan, J., ... & Zaremba, W. (2021). Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374.


Clark, A. (1999). An embodied cognitive science?. Trends in cognitive sciences, 3(9), 345-351.


Davis, N. (2013). Human-computer co-creativity: Blending human and computational creativity. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 9, No. 6, pp. 9-12).


Davis, N., Hsiao, C. P., Popova, Y., & Magerko, B. (2015). An enactive model of creativity for computational collaboration and co-creation. Creativity in the digital age, 109-133.


Davis, N., Hsiao, C. P., Singh, K. Y., Lin, B., & Magerko, B. (2017, June). Creative sense-making: Quantifying interaction dynamics in co-creation. In Proceedings of the 2017 ACM SIGCHI Conference on Creativity and Cognition (pp. 356-366).


Deshpande, M., & Magerko, B. (2021). Drawcto: A Multi-Agent Co-Creative AI for Collaborative Non-Representational Art.


Deshpande, M., Trajkova, M., Knowlton, A., & Magerko, B. (2023, June). Observable Creative Sense-Making (OCSM): A Method For Quantifying Improvisational Co-Creative Interaction. In Proceedings of the 15th Conference on Creativity and Cognition (pp. 103-115).


Di Paolo, E., Buhrmann, T., & Barandiaran, X. (2017). Sensorimotor life: An enactive proposal. Oxford University Press.


Fan, J. E., Dinculescu, M., & Ha, D. (2019). Collabdraw: an environment for collaborative sketching with an artificial agent. In Proceedings of the 2019 Conference on Creativity and Cognition (pp. 556-561).


Gibson, J. J. (2014). The ecological approach to visual perception: classic edition. Psychology press.


Guzdial, M., Liao, N., & Riedl, M. (2018). Co-creative level design via machine learning. arXiv preprint arXiv:1809.09420.


Ha, D., & Eck, D. (2017). A neural representation of sketch drawings. arXiv preprint arXiv:1704.03477.


Hoffman, G., & Weinberg, G. (2010). Shimon: an interactive improvisational robotic marimba player. In CHI'10 Extended Abstracts on Human Factors in Computing Systems (pp. 3097-3102).


Huang, X., & Belongie, S. (2017). Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE international conference on computer vision (pp. 1501-1510).


Huang, C. Z. A., Koops, H. V., Newton-Rex, E., Dinculescu, M., & Cai, C. J. (2020). AI song contest: Human-AI co-creation in songwriting. arXiv preprint arXiv:2010.05388.


Hutchins, E. (2000). Distributed cognition. International Encyclopedia of the Social and Behavioral Sciences. Elsevier Science, 138, 1-10.


Ibarrola, F., Lawton, T., & Grace, K. (2022). A Collaborative, Interactive and Context-Aware Drawing Agent for Co-Creative Design. arXiv preprint arXiv:2209.12588.


Jansen, C., & Sklar, E. (2021, September). Predicting Artist Drawing Activity via Multi-camera Inputs for Co-creative Drawing. In Annual Conference Towards Autonomous Robotic Systems (pp. 217-227). Cham: Springer International Publishing.


Kantosalo, A., & Riihiaho, S. (2019). Usability testing and feedback collection in a school context: Case poetry machine. Ergonomics in Design, 27(3), 17-23.


Kantosalo, A.; Toivanen, J. M.; Xiao, P.; and Toivonen, H. 2014. From isolation to involvement: Adapting ma- chine creativity software to support human-computer co- creation. In Proceedings of the Fifth International Con- ference on Computational Creativity, Ljubljana, Slovenia, 1–7.


Kantosalo, A., & Takala, T. (2020, September). Five C's for Human-Computer Co-Creativity-An Update on Classical Creativity Perspectives. In ICCC (pp. 17-24).


Karimi, P., Maher, M. L., Davis, N., & Grace, K. (2019). Deep learning in a computational model for conceptual shifts in a co-creative design system. arXiv preprint arXiv:1906.10188.


Kasneci, E., Seßler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., ... & Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103, 102274.


Kellas, J. K., & Trees, A. R. (2014). Rating interactional sense-making in the process of joint storytelling. In The Sourcebook of Nonverbal Measures (pp. 281-294). Psychology Press.


Kim, J., & Maher, M. L. (2021). Evaluating the Effect of Co-Creative Systems on Design Ideation. In ICCC (pp. 440-443).


Kupers, E., Van Dijk, M., & Lehmann-Wermser, A. (2018). Creativity in the here and now: A generic, micro-developmental measure of creativity. Frontiers in Psychology, 9, 2095.


Lin, Y., Guo, J., Chen, Y., Yao, C., & Ying, F. (2020, April). It is your turn: Collaborative ideation with a co-creative robot through sketch. In Proceedings of the 2020 CHI conference on human factors in computing systems (pp. 1-14).


Lin, Z., Ehsan, U., Agarwal, R., Dani, S., Vashishth, V., & Riedl, M. (2023). Beyond Prompts: Exploring the Design Space of Mixed-Initiative Co-Creativity Systems. Accepted to ICCC’23.


Long, D., Jacob, M., & Magerko, B. (2019). Designing co-creative AI for public spaces. In Proceedings of the 2019 Conference on Creativity and Cognition (pp. 271-284).


Margarido, S., Machado, P., Roque, L., & Martins, P. (2022, August). Let’s Make Games Together: Explainability in Mixed-initiative Co-creative Game Design. In 2022 IEEE Conference on Games (CoG) (pp. 638-645). IEEE.


Norman, D. A. (1999). Affordance, conventions, and design. Interactions, 6(3), 38-43.


Norman, D. (2013). The design of everyday things: Revised and expanded edition. Basic books.


Oh, C., Song, J., Choi, J., Kim, S., Lee, S., & Suh, B. (2018, April). I lead, you help but only with enough details: Understanding user experience of co-creation with artificial intelligence. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1-13).


O'Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and brain sciences, 24(5), 939-973.


Rezwana, J., & Maher, M. L. (2022). Designing creative AI partners with COFI: A framework for modeling interaction in human-AI co-creative systems. ACM Transactions on Computer-Human Interaction.


Robbins, P., & Aydede, M. (Eds.). (2008). The Cambridge handbook of situated cognition. Cambridge University Press.


Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10684-10695).

Slayton, S. C., D'Archer, J., & Kaplan, F. (2010). Outcome studies on the efficacy of art therapy: A review of findings. Art therapy, 27(3), 108-118.


Suchman, L. A. (2007). Human-machine reconfigurations: Plans and situated actions. Cambridge university press.


Trajkova, M., Deshpande, M., Knowlton, A., Monden, C., Long, D., & Magerko, B. (2023, July). AI Meets Holographic Pepper’s Ghost: A Co-Creative Public Dance Experience. In Designing Interactive Systems Conference (pp. 274-278).


Xu, F., Uszkoreit, H., Du, Y., Fan, W., Zhao, D., & Zhu, J. (2019). Explainable AI: A brief survey on history, research areas, approaches and challenges. In Natural Language Processing and Chinese Computing: 8th CCF International Conference, NLPCC 2019, Dunhuang, China, October 9–14, 2019, Proceedings, Part II 8 (pp. 563-574). Springer International Publishing.


Yuan, A., Coenen, A., Reif, E., & Ippolito, D. (2022, March). Wordcraft: story writing with large language models. In 27th International Conference on Intelligent User Interfaces (pp. 841-852).


Zhang, C., Yao, C., Liu, J., Zhou, Z., Zhang, W., Liu, L., ... & Wang, G. (2021, May). StoryDrawer: A co-creative agent supporting children's storytelling through collaborative drawing. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-6).


Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., ... & Wen, J. R. (2023). A survey of large language models. arXiv preprint arXiv:2303.18223.


Zhu, J., Liapis, A., Risi, S., Bidarra, R., & Youngblood, G. M. (2018, August). Explainable AI for designers: A human-centered perspective on mixed-initiative co-creation. In 2018 IEEE Conference on Computational Intelligence and Games (CIG) (pp. 1-8). IEEE.