Concept
“Poetic Drawing” is an interactive digital artwork that transforms a user’s drawing into poetry. The idea came from the intersection of creativity and AI: how a simple visual gesture can evoke language and emotion through machine interpretation. Instead of having AI create images, this project reverses the direction: the human draws, and the AI responds with a poetic reflection.
Tools & Technologies Used
p5.js – for building the interactive drawing interface and UI layout.
Gemini API (Google Generative Language) – for generating poetic text responses based on the user’s drawing.
JavaScript – for program logic and event handling.
Process
Interface Design
I first designed a clean drawing interface with a fixed canvas, centered on the page. I added tools for pen, eraser, color selection, brush size adjustment, undo, and clear.
Drawing Layer Setup
A separate p5 graphics layer (createGraphics) was used so that the strokes could be redrawn, erased, or undone easily without affecting the main UI layout.
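As a rough illustration of why a separate layer makes redrawing easy, here is a minimal sketch, not the project's actual code. The function name `redrawLayer` and the segment fields (`x1`, `y1`, `x2`, `y2`, `color`, `size`, `tool`) are assumptions made for the example. `layer` stands for the object returned by p5's `createGraphics()`, and only its `clear`, `stroke`, `strokeWeight`, `line`, `erase`, and `noErase` methods are used.

```javascript
// Hypothetical helper: repaint an off-screen p5.Graphics layer from stored segments.
// The segment shape used here is an assumption for illustration.
function redrawLayer(layer, segments) {
  layer.clear(); // wipe the layer to transparent; the main canvas and UI are untouched
  for (const seg of segments) {
    if (seg.tool === "eraser") {
      layer.erase(); // shapes drawn from here on remove pixels instead of painting
    } else {
      layer.noErase();
      layer.stroke(seg.color);
    }
    layer.strokeWeight(seg.size);
    layer.line(seg.x1, seg.y1, seg.x2, seg.y2);
  }
  layer.noErase(); // leave the layer in normal drawing mode
}
```

In a p5 sketch, `draw()` could then paint the background and UI first and place the layer on top with `image(layer, x, y)`, so clearing or repainting strokes never disturbs the rest of the page.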
Gemini Integration
I connected the Gemini 2.0 Flash model through its REST API. The code sends the canvas image (base64-encoded PNG) along with a text prompt asking Gemini to write a poetic description.
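The request described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the exact code from the sketch: the helper name `buildGeminiRequest`, the default model id `gemini-2.0-flash`, and passing the API key as an argument are all my assumptions. The canvas image is expected as a data URL, such as the one returned by the canvas element's `toDataURL("image/png")`.

```javascript
// Hypothetical helper: build the fetch() arguments for a Gemini generateContent call.
// ASSUMPTIONS: the model id, the prompt wording, and how the key is supplied are illustrative.
function buildGeminiRequest(dataUrl, prompt, apiKey, model = "gemini-2.0-flash") {
  // A data URL looks like "data:image/png;base64,<payload>";
  // only the base64 payload after the comma is sent to the API.
  const base64 = dataUrl.split(",")[1];
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": apiKey,
      },
      body: JSON.stringify({
        contents: [
          {
            parts: [
              { text: prompt },
              { inline_data: { mime_type: "image/png", data: base64 } },
            ],
          },
        ],
      }),
    },
  };
}
```

A caller would pass the result to `fetch(url, options)` and read the poem from `candidates[0].content.parts[0].text` in the JSON response. Keeping the request building separate from the network call makes it easy to inspect the payload when debugging.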
Display System
The generated poem appears under the drawing, styled inside a soft purple box. The box updates each time the user presses “Generate Poem.”
Testing & Refinement
I debugged several issues, including image upload errors, API version mismatches, and missing function definitions (e.g., resetDrawingLayer()). After cleaning up the logic, the app became stable and responsive.
What I Learned
How to integrate AI language generation with a creative coding environment like p5.js.
The importance of UI flow and user experience: small features like undo and clear significantly improve interaction.
Error handling and debugging APIs in the browser environment.
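For the error-handling point above, here is a minimal sketch of how an API response could be turned into either poem text or a readable error. The helper name `extractPoem` is hypothetical, and the response shapes it reads (`error.message` on failure, `candidates[0].content.parts[].text` on success) are what I assume the Gemini REST API returns. It is a sketch, not the project's code.

```javascript
// Hypothetical helper: convert an HTTP status plus parsed JSON body into poem text,
// or throw an Error with a message that is useful in the browser console.
// ASSUMPTION: failures carry { error: { message } } and successes carry candidates[].
function extractPoem(status, body) {
  if (status < 200 || status >= 300) {
    const detail = body && body.error && body.error.message
      ? body.error.message
      : "no details";
    throw new Error(`Gemini request failed (${status}): ${detail}`);
  }
  // Join all text parts of the first candidate; tolerate missing fields.
  const parts = body?.candidates?.[0]?.content?.parts ?? [];
  const text = parts.map((p) => p.text ?? "").join("").trim();
  if (!text) {
    throw new Error("Gemini returned no text");
  }
  return text;
}
```

Wrapping the call in `try`/`catch` and showing `err.message` in the poem box would surface problems such as a 404 from a wrong model name instead of failing silently.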
Challenges
Gemini API access: The model endpoints and versions changed several times, which caused repeated 404 errors until I switched to a working version (v1beta/gemini-2.5-flash).
Front-end restrictions: Calling the Gemini API directly from front-end code was not straightforward, so I had to find a workaround to test it in a controlled environment.
Drawing logic: Managing stroke history for undo/redo required storing each segment with color, size, and tool information.
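The stroke-history idea from the drawing-logic item can be sketched as below. This is one possible structure, not the code used in the project: the class name `StrokeHistory`, its method names, and the choice to group segments into whole strokes (so that one undo removes one pen gesture) are my assumptions.

```javascript
// Hypothetical stroke history with undo/redo.
// ASSUMPTION: segments are grouped per stroke so a single undo removes a whole gesture.
class StrokeHistory {
  constructor() {
    this.strokes = [];   // completed strokes, oldest first
    this.redoStack = []; // strokes removed by undo, most recent last
    this.current = null; // stroke being drawn, if any
  }
  beginStroke(tool, color, size) {
    this.current = { tool, color, size, segments: [] };
  }
  addSegment(x1, y1, x2, y2) {
    if (this.current) this.current.segments.push({ x1, y1, x2, y2 });
  }
  endStroke() {
    if (this.current && this.current.segments.length > 0) {
      this.strokes.push(this.current);
      this.redoStack = []; // a new stroke invalidates anything that was undone
    }
    this.current = null;
  }
  undo() {
    const stroke = this.strokes.pop();
    if (stroke) this.redoStack.push(stroke);
    return stroke !== undefined;
  }
  redo() {
    const stroke = this.redoStack.pop();
    if (stroke) this.strokes.push(stroke);
    return stroke !== undefined;
  }
  clear() {
    this.strokes = [];
    this.redoStack = [];
    this.current = null;
  }
  // Flatten to one record per segment, each carrying color, size, and tool.
  flatten() {
    return this.strokes.flatMap((s) =>
      s.segments.map((seg) => ({ ...seg, color: s.color, size: s.size, tool: s.tool }))
    );
  }
}
```

In a sketch like this, `mousePressed` would call `beginStroke`, `mouseDragged` would call `addSegment`, and `mouseReleased` would call `endStroke`. After any undo, redo, or clear, the drawing layer can be repainted from `flatten()`.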
Future Improvements
Add AI interpretation of the drawing, e.g., using Gemini or Vision models to guess what the drawing represents and display a “This looks like...” label next to the poem.
Add animated transitions when poetry appears, for a more emotional response.
Improve font and layout for better aesthetics, maybe using handwriting-style fonts.
Expand to multi-modal interaction (e.g., sound reacting to strokes or poetic voice narration).
Reflection
This project showed me how AI can act not as a replacement for human creativity, but as a poetic collaborator. Each time I drew something, the AI responded with unexpected language that made me see my own drawing differently. It’s a small experiment in human–machine co-creation.
https://editor.p5js.org/VivianaviVi/sketches/sjdlUCRXz