Co-Creative AI

A New Landscape in HCI & AI

Co-Creative AI Paper Google Doc

***Collaborators Are Welcome***

Introduction

Creativity is typically defined as a product or process that is novel and valuable (Sternberg & Lubart, 1999). Creativity has been interwoven in technological developments since the early days of computing (Boden, 1994). More recently, creativity support tools (CSTs) have become effective at helping people rapidly explore ideas, visualize multiple solutions, and perform complex simulations (Shneiderman, 2007). Yet many creativity support tools are not able to create original artistic contributions, such as adding a new line, poem stanza, or musical accompaniment to a shared artwork (Davis et al., 2014). Advances in AI, computational creativity, and robotics are enabling the development of co-creative AI agents, i.e. computer colleagues (Lubart, 2005), that can collaborate with people in a more naturalistic and improvisational manner. A working definition of co-creativity could start as follows:

"Co-creativity is classified as multiple parties contributing to the creative process in a blended manner (Candy et al. 2002)....Cooperation, on the other hand can be modeled as a distribution of labor where the result only represents the sum of each individual contribution (Candy et al. 2002). Co-creativity allows participants to improvise based on decisions of their peers. Ideas can be fused, and built upon in ways that stem from the unique mix of personalities and motivations of the team members (Candy et al. 2002). Here, the creative product emerges through interaction and negotiation between multiple parties, and the sum is greater than individual contribution[s]. These interaction principles can be extended to include a sufficiently creative agent that can collaborate with human users in a new kind of human-computer creativity" (Davis et al., 2014 emphasis added).

Co-Creative Artificial Intelligence is an interdisciplinary field that brings together HCI, AI, Machine Learning, Computational Creativity, Creativity Research, Collaboration Research, and Cognitive Science to enable the design, implementation, and evaluation of computationally creative systems that collaborate with users on shared creative products (Davis, 2014). This field bridges the academic domains of creativity support tools (CSTs) that accelerate the user's existing creativity (Gross, 1996; Gross & Do, 1996; Nakakoji, 2006; Shneiderman et al., 2006; Shneiderman, 2007; Johnson et al., 2009; Frick et al., 2019), and computational creativity systems that generate new creative products autonomously or facilitate human creativity (Colton & Wiggins, 2012; Maher, 2012; Colton et al., 2015; Jordanous, 2016; Grace et al., 2017; Carnovalini, 2020). The resulting systems take on the role of a nanny, pen-pal, coach, or computer colleague and act as partners in a co-creative workflow (Lubart, 2005).

(Davis et al., 2014)

Kantosalo & Toivonen (2014) offer a formal definition of co-creation that is strongly rooted in the literature. It is quoted below:

"We use the term human-computer co-creation to refer to collaborative creativity where both the human and the computer take creative responsibility for the generation of a creative artefact. The term co-creation refers here to a social creativity process “leading to the emergence and sharing of creative activities and meaning in a socio-technical environment” (Fischer et al. 2005), but with the emphasis that the computer is, instead of only providing the socio-technical environment, also an active participant in the creative activities. This is similar to the definition of mixed-initiative co-creativity (MI-CC) by Yannakakis et al. (2014), who define it as the creation of artefacts with the interaction of a human and a computational initiative. They note that the two participants do not need to contribute to the same degree, and we do not demand symmetric contributions from human-computer co-creative systems neither." (Kantosalo & Toivonen, 2014)

Kantosalo & Takala (2020) elaborate the earlier definition of co-creation to include the creative collective, and offer the following definition:

"The creative human–computer collective consists of at least one human and one computational collaborator. The collaboration of the collective consists of individual and collaborative creative processes and interactions that support them. The collaboration results in an artefact or a product that represents the contributions of the collective. These contributions are communicated to and shared with a wider community of peers, audiences, and other social influences. The co-creative collaboration takes place in a context representing the environment of the creative act, including e.g. cultural artefacts and conventions, and more immediate factors such as material affordances and shared mental resources, such as the creative task." (Kantosalo & Takala, 2020)

Enactive Co-Creation

The cognitive science theory of enaction (Stewart et al., 2010; Vernon, 2010) can be used as a lens through which to view the design and evaluation of co-creative AI systems (Davis et al., 2016). Five core ideas in the theory of enaction demonstrate what a co-creative agent could cognitively achieve:

Autonomy: "How cognitive agents operate independently, based on their own intrinsic laws, to sustain themselves and continually generate their identity through interaction with the environment according to those laws. A system is defined as ‘autonomous’ when it can change the laws that govern its interaction with the world by casting a web of significance on the elements in its environment." (Davis et al., 2016)
Sense-making: "As autonomous agents interact with their environment, they gradually detect patterns of regularities from dynamic feedback loops, which help them make sense of the environment in a process referred to as sense-making." (Davis et al., 2016)
Emergence: "Enaction describes the emergence of meaning structures through coordination and coupling between a cognitive agent and some system in the environment, i.e. the actions the agent performs are tightly intertwined with predictable effects from the environment" (Davis et al., 2016)
Embodiment: "Agents must act with their body to make sense of the environment, which inherently constrain and affords certain types of interaction." (Davis et al., 2016)
Experience: "The agent is shaped by the meaning it has imbued into its environment and this influences how future actions are generated and evaluated." (Davis et al., 2016)

Enaction focuses on how embodied interaction (Varela et al., 2017) with the world grounds meaning structures by gradually detecting patterns of regularity through perception-action sensorimotor loops (Noë, 2004). This occurs in an interactive process called sense-making (Thompson & Stapleton, 2009; Colombetti, 2010). When multiple agents work together to make sense of a situation, as in the process of co-creation, it is referred to as participatory sense-making (De Jaegher & Di Paolo, 2007; Fuchs & De Jaegher, 2009). When sense-making is applied to a novel and valuable goal, it can become creative sense-making, or what we know as creativity or collaboration. When multiple agents couple their actions during creative sense-making and build on each other's contributions through time, it is referred to as enactive co-creation, defined as follows:

"Cognitive agents, from the perspective of enaction, interact to gradually determine patterns of regularity and meaning using dynamic feedback loops (Fuchs & De Jaegher, 2009). Therefore, each collaborator is influenced both by the content of actions (i.e. lines of a drawing) as well as the dynamics of interaction that have emerged during in-the-moment interaction, such as the rhythm of turn taking, area of focus, and manner of motion. This dual feedback system has the potential to lead collaborators into new realms of creative expression with dynamic, multi-layered, and sometimes competing mental models for evaluating what creative contributions ‘make sense’ in the current situation." (Davis et al., 2016)

A cognitive agent can be said to exhibit a "degree of structural coupling with a system in the environment when it has understood the parameters and mechanisms of that system to the extent that it can predict how different actions will affect the system" (Davis et al., 2014). Adding agents makes the situation more complex because each agent is engaging in a process of sense-making with the environment as well as with each other. This mutually influencing process is referred to as participatory sense-making. It is formally defined as:

"A co-regulated coupling between at least two autonomous agents, where: (i) the co-regulation and the coupling mutually affect each other, constituting an autonomous self-sustaining organization in the domain of relational dynamics and (ii) the autonomy of the agents involved is not destroyed (although its scope can be augmented or reduced)" (De Jaegher & Di Paolo, 2007)

This quote demonstrates how meaning production in participatory sense-making is significantly influenced and structured by the manner of interacting together, here described as "relational dynamics, i.e. the rhythm of turn taking, ways of providing feedback, and manner of actions (independent from the content of actions)" (Davis et al., 2014).

Co-regulated participatory sense-making occurs when agents "make sense of both these relational or interaction dynamics as well as the content of actions in a dual sense-making process that is unique to participatory sense-making and co-creation" (Davis et al., 2014). There may be clear examples of such co-regulation in human collaboration. The field of co-creative AI can strive to design and develop co-creative agents that exhibit the degree of autonomy and sense-making that would give rise to a similar co-regulated participatory sense-making as seen in human co-creation (Davis et al., 2014). "This type of interaction is defined as the ideal for creative collaboration due to its ability to facilitate emergent meaning in an open ended interaction that goes beyond what the individual user could have accomplished alone" (Davis et al., 2014).

Interaction Dynamics of Co-Creation

Kantosalo & Toivonen (2016) have introduced modes of co-creativity describing different types of interactions that can occur in a co-creative scenario. These modes are quoted below:

"We focus on cases where one human and one computational agent collaborate in co-creation, as this is currently the most common case presented in literature. We define alternating co-creativity as co-creativity in which the co-creative partners take turns in creating a new concept satisfying the requirements of both parties. As a sister term, we define task-divided co-creativity as co-creativity in which the co-creative partners take specific roles within the co-creative process, producing new concepts satisfying the requirements of one party. We focus on the first which we consider more interesting as it puts the human and the computational agent in a more equal position." (Kantosalo & Toivonen, 2016)

Co-creation is more likely to occur when all participants feel confident in their ability to contribute and add value and meaning to the shared activity. Defining generalized conceptual structures that facilitate co-creation is therefore critical both to developing a formal theory of co-creation and to supporting effective co-creation in practice. Instituting gradually increasing line limits in the domain of collaborative drawing is one example of a heuristic and conceptual foundation upon which fruitful co-creation can emerge, though the generalizability of this approach is still unknown. To develop a general theory of co-creation it is necessary to tease apart the constituent elements of co-creative interaction dynamics and examine how they function together as a holistic system of social cognition. The initial taxonomy of interaction dynamics below describes the manner in which people engage to make sense together through time.

Turn Length

Turn length corresponds to how long a participant (or all participants) spends on each turn, where a turn ends when the participant pauses for longer than a certain temporal threshold.
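
The pause-threshold definition of a turn can be sketched in code. The Python below is a minimal illustration under stated assumptions, not a method taken from the cited literature: it assumes each contribution is logged as a timestamp in seconds, and the function names `segment_turns` and `turn_lengths`, the sample data, and the 2-second threshold are all hypothetical.

```python
from typing import List, Tuple


def segment_turns(timestamps: List[float], pause_threshold: float) -> List[Tuple[float, float]]:
    """Group contribution timestamps into turns.

    A new turn starts whenever the gap since the previous contribution
    exceeds pause_threshold. Returns a list of (start, end) pairs.
    """
    turns: List[Tuple[float, float]] = []
    for t in sorted(timestamps):
        if turns and t - turns[-1][1] <= pause_threshold:
            # Gap is short enough: extend the current turn.
            turns[-1] = (turns[-1][0], t)
        else:
            # First event, or a pause longer than the threshold: open a new turn.
            turns.append((t, t))
    return turns


def turn_lengths(turns: List[Tuple[float, float]]) -> List[float]:
    """Duration of each turn in seconds (0.0 for a single-contribution turn)."""
    return [end - start for start, end in turns]


# Hypothetical log of one participant's stroke times, in seconds.
strokes = [0.0, 0.5, 1.2, 6.0, 6.4, 15.0]
turns = segment_turns(strokes, pause_threshold=2.0)
lengths = turn_lengths(turns)
```

With this sample log the strokes fall into three turns, the last of which contains a single contribution. The choice of threshold determines how finely a session is divided, so in practice it would need to be tuned to the creative domain.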

Rhythm of Turn

Rhythm of turn describes how rapidly contributions are made within a turn, e.g. many small rapid contributions versus a few slow contributions. Turn rhythm has two major axes of variance: the amount of content produced in a turn and the speed at which that content is applied.
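
The two axes of turn rhythm can be made concrete with a small sketch. This is only one possible operationalization: the `Contribution` and `TurnRhythm` types, the `turn_rhythm` function, and the idea of measuring content as a numeric amount (for example, stroke length in pixels) are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contribution:
    time: float    # seconds since the session started
    amount: float  # hypothetical content measure, e.g. stroke length in pixels


@dataclass
class TurnRhythm:
    amount: float           # axis 1: total content produced in the turn
    speed: Optional[float]  # axis 2: content per second; None if the turn has no duration


def turn_rhythm(turn: List[Contribution]) -> TurnRhythm:
    """Summarize a non-empty turn along the two rhythm axes."""
    total = sum(c.amount for c in turn)
    duration = max(c.time for c in turn) - min(c.time for c in turn)
    # A single-contribution turn has zero duration, so speed is undefined.
    speed = total / duration if duration > 0 else None
    return TurnRhythm(amount=total, speed=speed)


# Many small rapid contributions versus a few slow ones (made-up data).
rapid = [Contribution(0.5 * i, 10) for i in range(5)]
slow = [Contribution(0.0, 60), Contribution(10.0, 60)]

rapid_rhythm = turn_rhythm(rapid)
slow_rhythm = turn_rhythm(slow)
```

In this made-up data the slow turn produces more content overall while the rapid turn applies content faster, which shows why the two axes are worth tracking separately.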

Turn-Taking Style

The style of turn-taking corresponds to how turn taking is structured at a given point in the collaboration. A collaborative session may include many styles of turn-taking throughout its course, but there are two main categories that define the spectrum of turn-taking style.

Delineated turns: one participant contributes at a time
Synchronous turns: participants choose when to begin and end a turn regardless of the activities of others
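
One way to tell the two categories apart in logged data is to check whether turns from different participants overlap in time. The sketch below assumes turns have already been extracted as (start, end) intervals per participant. The function names and the strict either/or labelling are simplifications introduced here, since the text describes a spectrum rather than a binary.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) of one turn, in seconds


def overlap_seconds(a: Interval, b: Interval) -> float:
    """Length of time two intervals share; 0.0 if they only touch or are disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def classify_turn_taking(turns: Dict[str, List[Interval]]) -> str:
    """Return "synchronous" if any two participants' turns overlap, else "delineated"."""
    names = list(turns)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            for a in turns[first]:
                for b in turns[second]:
                    if overlap_seconds(a, b) > 0:
                        return "synchronous"
    return "delineated"


# Hypothetical sessions between a human and a co-creative agent.
alternating = {"human": [(0, 5), (10, 15)], "agent": [(5, 10), (15, 20)]}
simultaneous = {"human": [(0, 8)], "agent": [(3, 12)]}

alternating_style = classify_turn_taking(alternating)
simultaneous_style = classify_turn_taking(simultaneous)
```

Because a session may move between styles, a fuller analysis could apply this check to successive time windows rather than to the session as a whole.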

Roles

The roles a co-creative participant can take are defined by the types of actions and activities in which the participant is engaged. While roles are fluid during improvised collaboration, some archetypal roles may emerge throughout the process of collaboration.

Facilitator: Coordinates the interaction dynamics of the collaboration. For example, setting ground rules, explaining process, shifting phases throughout the collaboration (e.g. changing turn taking styles).
Visionary: Seeing potential themes, narratives, objects and other meaningful patterns the creative product could become.
Framer: Forging new foundations upon which later contributions can meaningfully build.
Builder: Adding onto the foundations to make them into well established and robust themes.
Designer: Defining the narrative theme and flair of the different meaningful structures emerging in the creative product.
Detailer: Adding flourishes and details to existing structures to realize the designer’s vision in more detail.
Finisher: Tying the various meaningful patterns throughout the creative product together into a coherent whole that aligns (and potentially updates) the vision for the creative product.

Facilitation Types

Co-regulated

All participants contribute to defining and maintaining the creative trajectory of the creative product. Leadership structure is fluid based on the current interests and desires of the participants and their ability to communicate and persuade other collaborators of the value of their ideas.

Facilitated

One person (or a group of people) guides the creative trajectory roughly down a path they think will be harmonious, while still considering the feedback of all participants. Typically, the most experienced collaborator assumes the role of facilitator because of their experience in the domain. For example, a parent often facilitates co-creation with their child by suggesting different activities while including the ideas of the child in the overall trajectory of the collaboration.

Hierarchical

Participants at different levels of ownership and influence determine what will be added by whom throughout the drawing. If no expert collaborator intervenes as a facilitator, a natural hierarchy may emerge based on the skill and social dynamics of group members.

Materials and Resources

The types of creative tools available to the participants during the collaboration afford certain types of contributions and constrain the problem space. For example, a collaboration might begin with lead pencils, then advance to pen, then sharpie, then colored pencil. Each time a new resource is introduced in the collaboration, new potentials for creative expression exist. The facilitator determines when to introduce various physical and conceptual resources (e.g. turn taking styles) in order to facilitate an effective co-creation process.

Spatio-Temporal Distribution

Collaborations can occur in different spatio-temporal contexts depending on the needs and requirements of the project and resources available to the collaborators.

Location

Co-located collaboration: collaborators can be co-located and work on the same physical (or virtual) artifact
Distance collaboration: collaborators can work on the same physical or virtual artifact in different places by transporting the artifact between locations (either physically or virtually).

Timing

Synchronous: collaborators can be co-present in the same moment and actively look at the contributions that were made by collaborators and respond to them in real time.
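Asynchronous: collaborators can contribute at different times, each responding to the contributions others have left on the artifact rather than reacting in real time.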

Computational Co-Creativity

While it is a critical component of human social cognition, co-creation presents some tough computational challenges that require interdisciplinary solutions, such as:

Improvisational: In addition to knowing what type of action is appropriate, co-creative agents must be able to coordinate in real time to be an effective partner, requiring knowledge about how different types of contributions would affect their partner given prior experience.
Dynamic: Co-creation typically occurs in a real-time context that includes continuously changing ideas and input. A co-creative agent needs to be able to dynamically filter and prioritize relevant input given expectations based on experience.
Open-ended: Co-creation does not require an end goal (e.g. a conversation does not have an ideal end state), and can introduce knowledge from virtually any domain that either participant desires, which requires robust knowledge or the ability to rapidly acquire new knowledge through interaction.
Emergent: When multiple agents generate contributions to a shared artifact, the product naturally grows in unexpected and happenstance ways that often involve conceptual shifts and re-interpretations. This situated awareness requires dynamically changing models of the environment. However, it is difficult to determine when and why to change this type of knowledge in a co-creative agent because there is no right answer, simply a prediction about how effective a contribution might be given the predicted state of the user and the interpreted state of the environment.

Co-Creative AI Applications

Co-creative AI is uniquely positioned as a learning tool for novices to engage with authentic computing concepts (Freeman et al., 2014). Co-creative AI can be used in a variety of interactive art applications such as drawing, design, music, poetry, drama, and painting. There is a plethora of tools and technologies related to interactive art in co-creative AI, such as the Drawing Apprentice (Davis et al., 2014), EarSketch (Freeman et al., 2014), Shimon (Weinberg et al., 2009), and the Painting Fool (Colton et al., 2015). Co-creative AI can also serve as a tool to engage and inform the general public about artificial intelligence (e.g. AI literacy (Long & Magerko, 2020) and explainable AI (Ehsan & Riedl, 2020)) and its role in potentially augmenting their creative activities and vocations.