Read "Open Sourcing the Origin Stories: The ml5.js Model and Data Provenance Project" by Ellen Nickles and reflect on the following questions:
What questions do you still have about the model and the associated data?
I still wonder about the exact origins and conditions of the datasets behind the ml5.js models. For example, which specific dataset versions were used, and under what licenses are they distributed? It is also unclear how much demographic diversity is represented in these datasets, and whether certain groups or contexts are underrepresented. I would like to know more about the preprocessing steps and training configurations, since these choices shape how the models see and interpret data. I am also curious about how failures and biases have been tracked so far, and whether there is a plan for communicating updates or corrections to users.
Are there elements you would propose including in the biography?
Licensing and Reuse Conditions
Understanding the license of each dataset and model is crucial for artists and educators. A clear statement of the license not only defines whether work can be shown publicly or used commercially, but also reveals who is excluded from reuse. When licenses are vague or restrictive, they can unintentionally limit accessibility and reproduce inequalities by privileging certain communities over others. Including precise license terms in each model biography helps users make informed creative and ethical choices.
Evaluation and Failure Modes
Documenting failure modes goes beyond reporting overall accuracy. It requires exposing where the model breaks down and what kinds of mistakes appear most often. By publishing anecdotal user reports and known blind spots, provenance records transform bias from a hidden flaw into an explicit design consideration. Artists can then engage critically, either by using glitches as material or by openly addressing the limitations, while avoiding misrepresentation.
Update Policy and Fairness
Models and datasets evolve, so a transparent update policy is essential for fairness and reproducibility. Version numbers, changelogs, and access to older checkpoints allow creators to trace how performance and bias shift over time. In this way, the provenance record remains a living tool, supporting both ethical accountability and artistic experimentation.
How does understanding the provenance of the model and its data inform your creative process?
Understanding the provenance of a model and its data helps me see machine learning not as a neutral tool but as a material with its own history and politics. Knowing where the data comes from, who collected it, and under what circumstances makes me more attentive to the values embedded in the outputs. This awareness encourages me to approach models critically rather than take their results at face value.
In practice, this shifts how I design interactive works. Instead of simply asking “what can the model generate?”, I also ask “whose voices and perspectives are represented in this model, and whose are missing?” That line of questioning often leads me to highlight absence, distortion, or bias as part of the artistic experience. Provenance becomes a creative constraint that guides not only technical choices but also the narrative and conceptual framing of a project.
At the same time, provenance makes my process more transparent and accountable. By understanding the data lineage, I can better explain to audiences why a model behaves in certain ways and what its limitations are. This openness fosters trust and helps me use machine learning in ways that are both aesthetically engaging and ethically responsible.
Notes
Mixing Movement and Machine - Maya Man
Embodiment as interface: The body itself can become both the subject and input for AI, reminding you that movement is data but also meaning. Think about how gestures, posture, or dance can shape interaction beyond simple keypoints.
Agency and interactivity: Giving the performer or audience control over aspects of AI output suggests ways to co-create with models rather than treat them as deterministic machines.
The point is not "What are all the things it can do?" but "What do I want from it?"
Humans of AI - Philipp Schmitt
Visibility of hidden labor: Every AI model is built on human work. This encourages reflecting on who or what is invisible in your own projects, and how to make contributions or biases transparent.
Provenance as storytelling: Treat datasets and models as narrative objects. Each image, label, or data point has a story. Consider how exposing that history can enrich our work conceptually and ethically.
My project this week is called "Move your body". In this project, I created an interactive experience that responds to the user’s body movements captured through a webcam. Using the ml5.js BodyPose model, I tracked the keypoints of the user’s body. Seven points appear on the canvas, each associated with a unique sound. When the user’s hand collides with a point, its sound plays while the background music temporarily pauses. The points also change their position after being triggered, creating a dynamic and playful interaction.
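The hand-to-point interaction described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the names `TriggerPoint`, `isHitBy`, `relocate`, and `updatePoints` are hypothetical, and it assumes each hand keypoint is a plain object with `x`, `y`, and `confidence` fields, which appears to match what ml5.js BodyPose reports for keypoints such as the wrists (worth checking against the ml5.js version in use). Sound playback is left out; the function only returns which sounds should be triggered.

```javascript
// Hypothetical sketch of the trigger-point logic; not the project's actual code.
// Assumes keypoints are plain objects like { x, y, confidence }.

class TriggerPoint {
  constructor(x, y, radius, soundId) {
    this.x = x;
    this.y = y;
    this.radius = radius;
    this.soundId = soundId; // stand-in for a loaded sound file
  }

  // True when a confidently detected keypoint lies inside the point's circle.
  isHitBy(keypoint, minConfidence = 0.3) {
    if (!keypoint || keypoint.confidence < minConfidence) return false;
    const dx = keypoint.x - this.x;
    const dy = keypoint.y - this.y;
    return dx * dx + dy * dy <= this.radius * this.radius;
  }

  // Move the point to a new random position fully inside the canvas.
  relocate(width, height, random = Math.random) {
    this.x = this.radius + random() * (width - 2 * this.radius);
    this.y = this.radius + random() * (height - 2 * this.radius);
  }
}

// Check every point against every hand keypoint.
// Returns the sound ids that were triggered and relocates those points.
function updatePoints(points, hands, width, height, random = Math.random) {
  const triggered = [];
  for (const point of points) {
    if (hands.some((hand) => point.isHitBy(hand))) {
      triggered.push(point.soundId);
      point.relocate(width, height, random);
    }
  }
  return triggered;
}
```

In a p5.js draw loop, the wrist keypoints from the latest pose would be passed in as `hands`, and each returned id would be used to start the matching sound. Keeping the geometry separate from the audio makes the collision rule easy to test without a webcam.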
Through this project, I deepened my understanding of:
Pose detection and real-time tracking with ml5.js and p5.js.
Interactive audio programming, including playing multiple sounds and managing background music.
Animation principles such as dynamic positioning, blinking effects, and visual feedback based on user input.
Structuring a p5.js project, including class-based design for the points and handling multiple interactive elements.
Potential Improvements:
Implement smoother audio transitions, so that triggering point sounds feels more seamless with background music.
Enhance visual feedback by customizing colors, sizes, and animations of points and skeleton dynamically based on user movement.
Add more complex interactions, such as multiple hands triggering different visual or audio effects simultaneously.
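The first improvement above, smoother audio transitions, could be approached by lowering the background music gradually ("ducking") instead of pausing it. The sketch below is one possible way to do that, not something the project currently implements: `easeVolume` and `backgroundTarget` are hypothetical helpers, and the default volume levels and easing rate are arbitrary assumptions.

```javascript
// Hypothetical helpers for fading background music; values are assumptions.

// Move a volume value a fraction of the way toward its target (called once per frame).
function easeVolume(current, target, rate = 0.1) {
  const next = current + (target - current) * rate;
  // Snap when close so the value settles exactly on the target.
  return Math.abs(target - next) < 0.001 ? target : next;
}

// Background music is lowered while any point sound is playing.
function backgroundTarget(activeSounds, normal = 1.0, ducked = 0.2) {
  return activeSounds > 0 ? ducked : normal;
}
```

Each frame, the result of `easeVolume` would be applied to the background track's volume, so the music dips while a point sound plays and rises again afterwards. If the project uses p5.sound, the ramp-time argument of `setVolume` may achieve a similar fade without per-frame code.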