This project applies deep learning-based computer vision techniques to creative optimization in mobile advertising. It explores current trends in dynamic creative optimization (DCO), applies deep learning models to real creative ad data, and automates the full workflow from data pipeline to model deployment and monitoring.
Dynamic creative is a method of programmatic advertising in which ad components such as headlines, descriptions, images, and CTAs are changed in real time according to parameters predefined by the advertiser. Common parameters include time of day, weather, and location.
Dynamic creative optimization (DCO) is a form of dynamic creative that uses machine learning to produce personalized experiences for viewers. This automated process leverages available customer data and other connected data sources, plus real-time testing and analytics, to select the most effective combination of creative elements for each viewer.
There are many benefits of incorporating dynamic creative optimization as part of your digital marketing strategy. Here are the top four.
Personalization
We know there is a direct correlation between personalized ads and sales: one study indicates that 80% of consumers are more likely to purchase when a brand provides a customized experience. Your audience likely has specific needs and preferences, and with DCO you can automatically deliver the right message, in the most effective creative form, to both your Millennial audience and the 50-plus crowd.
Automation
Ad development used to be a manual, time-consuming, and expensive process, especially if you tried to curate ads for each of your audience segments.
Using dynamic creative optimization instead drives cost savings and increases the ROI on every marketing dollar. Once the technology is put in place and parameters are defined, DCO is a fully automated process. Marketers can add data sets, rules, and creative assets to the platform as needed and devote more time to other mission-critical tasks. But, of course, they’ll still need to monitor the performance of campaigns and make adjustments when necessary.
Real-time response
Creating ads the old-fashioned way takes a lot of time and resources. DCO lets you address the needs of your audience from the moment they visit your site, and tailoring creatives based on real-time data drives more engagement and conversions.
DCO also allows brands to shift their messaging according to changing global or regional conditions. Changing course in real time can avoid the reputational damage of showing tone-deaf ads during a sensitive period (e.g., a natural disaster or local tragedy).
Improved performance and ROI
There is no more waiting until the campaign is over to measure its performance and determine what to do better next time. Instead, DCO provides metrics and adjusts creative elements in real time to improve performance and conversions. And when ads are curated specifically for each user, conversion rates climb, increasing the ROI of your ad spend.
This project focused on designing and building an algorithm to help an Ad Tech Company optimize its creative assets based on campaign performance data. Prior to this work, there was no systematic way to evaluate creatives during production or predict their performance when served to users. We developed a deep learning-based computer vision solution that segments objects from creative assets and relates them to key performance indicators (KPIs) of the corresponding campaigns.
The project used a proprietary dataset from the Ad Tech Company containing numerous mobile advertisement creatives and their corresponding performance metrics. The goal was to create a predictive model that could evaluate creative effectiveness before deployment.
We implemented a comprehensive technical approach that included:
Setting up a development environment with Git version control
Analyzing the provided dataset structure and contents
Extracting assets from existing creative advertisements
Identifying essential features and preprocessing them for deep learning models
Developing and training predictive models with appropriate evaluation metrics
Deploying the models and assessing their performance
Automating the entire process as a pipeline
Building an interactive dashboard for ease of use
The workflow was organized into five broad sections: Asset Extraction and Preprocessing, Data Preprocessing, Exploratory Data Analysis, Model Development and Training, and Dashboard Development.
We extracted the following assets from creative advertisements:
Beginning and ending video frames
Full advertisement video capture
Audio components
Position and size of clickthrough buttons
Position and size of logos
To automate this process, we used the following tools (a capture sketch follows the list):
Selenium for web interaction
PyAutoGUI to simulate user engagement (taps, swipes, scratches)
Tesseract OCR to extract text instructions from image frames
FFmpeg for video recording and processing
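The snippet below is a minimal sketch of how these tools can fit together, not the project's exact code: the placeholder URL, file names, and the fixed 15-second play time are all assumptions.

```python
# Hedged sketch of the asset-capture flow. Assumes a Chrome driver on PATH
# and a pre-recorded "ad.mp4"; names and timings are illustrative only.
import subprocess
import time

import pyautogui
import pytesseract
from PIL import Image
from selenium import webdriver

def capture_creative(url: str, play_seconds: int = 15) -> None:
    driver = webdriver.Chrome()
    driver.get(url)                                # load the playable creative
    driver.save_screenshot("frame_begin.png")      # beginning frame

    w, h = pyautogui.size()
    pyautogui.click(w // 2, h // 2)                # simulate a user tap
    time.sleep(play_seconds)                       # let the creative play out
    driver.save_screenshot("frame_end.png")        # ending frame
    driver.quit()

if __name__ == "__main__":
    capture_creative("https://example.com/creative")  # placeholder URL

    # Read any on-screen text instructions from the captured frame.
    instructions = pytesseract.image_to_string(Image.open("frame_begin.png"))

    # Split a recorded ad video into its audio track and its first frame.
    subprocess.run(["ffmpeg", "-y", "-i", "ad.mp4", "-vn", "audio.wav"], check=True)
    subprocess.run(["ffmpeg", "-y", "-i", "ad.mp4", "-frames:v", "1",
                    "first_frame.png"], check=True)
```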
For complex interactions requiring human input, we implemented a semi-automated system that waited for human interaction and recorded the results once interaction was detected; a hypothetical version of that wait loop is sketched below.
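One simple way to detect "interaction happened" is to poll the screen and resume once enough pixels change; the threshold and polling interval here are assumptions, not the project's tuned values.

```python
# Hypothetical wait loop for the semi-automated path: block until the screen
# content changes enough to suggest a human has interacted with the creative.
import time

import numpy as np
import pyautogui

def wait_for_human_interaction(poll_s: float = 0.5, threshold: float = 5.0) -> None:
    before = np.asarray(pyautogui.screenshot(), dtype=np.float32)
    while True:
        time.sleep(poll_s)
        now = np.asarray(pyautogui.screenshot(), dtype=np.float32)
        if np.abs(now - before).mean() > threshold:  # mean pixel change per channel
            return                                   # interaction detected
        before = now
```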
We extracted and processed multiple feature types (a combined extraction sketch follows the lists):
Audio Features:
Duration (milliseconds)
Intensity (dBFS)
Frame count
Text Features:
Transcribed audio using OpenAI's Whisper
Sentiment analysis using NLTK's SentimentIntensityAnalyzer
Word count
Image Features:
Emotion detection from beginning/ending frames using DeepFace
Estimated age of subjects
CTA button position (categorized into 9 regions)
Dominant colors (top 5) from image frames
All features were consolidated into a single DataFrame for the machine learning pipeline.
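The following is a combined, minimal sketch of the extraction steps described above, not the project's exact code: the "base" Whisper model size, the 3x3 CTA grid mapping, and the local-file interfaces are assumptions.

```python
import numpy as np
from nltk.sentiment import SentimentIntensityAnalyzer  # VADER sentiment
from PIL import Image
from pydub import AudioSegment                          # audio duration/loudness
from sklearn.cluster import KMeans                      # dominant-colour clustering

def audio_features(path: str) -> dict:
    seg = AudioSegment.from_file(path)
    return {
        "duration_ms": len(seg),                # pydub lengths are in milliseconds
        "intensity_dbfs": seg.dBFS,             # loudness relative to full scale
        "frame_count": int(seg.frame_count()),
    }

def text_features(audio_path: str) -> dict:
    import whisper                               # openai-whisper
    transcript = whisper.load_model("base").transcribe(audio_path)["text"]
    sia = SentimentIntensityAnalyzer()           # needs nltk.download("vader_lexicon")
    return {"word_count": len(transcript.split()), **sia.polarity_scores(transcript)}

def face_features(image_path: str) -> dict:
    from deepface import DeepFace                # heavyweight import kept local
    result = DeepFace.analyze(img_path=image_path, actions=["emotion", "age"],
                              enforce_detection=False)
    result = result[0] if isinstance(result, list) else result
    return {"dominant_emotion": result["dominant_emotion"], "age": result["age"]}

def dominant_colors(image_path: str, k: int = 5) -> list:
    pixels = np.asarray(Image.open(image_path).convert("RGB")).reshape(-1, 3)
    centers = KMeans(n_clusters=k, n_init=10).fit(pixels).cluster_centers_
    return centers.astype(int).tolist()          # top-k RGB centroids

def cta_region(x: int, y: int, width: int, height: int) -> int:
    col = min(3 * x // width, 2)                 # map the button centre onto a
    row = min(3 * y // height, 2)                # 3x3 grid of screen regions
    return row * 3 + col                         # 0..8, one of nine regions
```

Each helper returns a flat dict (or list) so the results can be merged row-wise into the single DataFrame mentioned above.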
The exploratory data analysis revealed several important patterns:
Distribution patterns in word count, click-through rate (CTR), and engagement rate (ER)
Sentiment category distribution across creatives
CTA button positioning preferences
Relationships between audio features and engagement metrics
Correlations between visual elements and click-through rates
These insights informed the feature selection process for model development and provided valuable information about what creative elements most influenced user engagement.
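An illustrative snippet of this kind of analysis is shown below; the file name and column names ("word_count", "intensity_dbfs", "ctr", "er") are assumptions about the consolidated DataFrame.

```python
# Correlate numeric creative features with the KPIs and plot a distribution.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("creative_features.csv")        # assumed consolidated file
print(df[["word_count", "intensity_dbfs", "ctr", "er"]].corr())

df["ctr"].hist(bins=30)                          # CTR distribution across creatives
plt.xlabel("Click-through rate")
plt.ylabel("Number of creatives")
plt.show()
```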
We implemented several approaches to predict CTR and ER values:
LSTM Neural Network Architecture:
Window size of 45 for sequential data processing
Two separate models for CTR and ER prediction
Training over 10 epochs with batch optimization
Architecture included:
Input LSTM layer with 8 units (return sequences enabled)
Second LSTM layer with 4 units
Dense output layer
Model training was performed with MSE loss function and Adam optimizer, with validation data to monitor performance and prevent overfitting. The models showed steady convergence in training and validation metrics.
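A minimal Keras sketch of this setup is shown below. The window size of 45, layer sizes, loss, and optimizer come from the description above; the feature count per timestep is an assumption.

```python
import tensorflow as tf

WINDOW = 45       # sequence window size from the text
N_FEATURES = 16   # assumption: depends on the consolidated feature set

def build_model() -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(WINDOW, N_FEATURES)),
        tf.keras.layers.LSTM(8, return_sequences=True),  # input LSTM layer
        tf.keras.layers.LSTM(4),                          # second LSTM layer
        tf.keras.layers.Dense(1),                         # regression output
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Two separate models, one per target KPI.
ctr_model, er_model = build_model(), build_model()
# ctr_model.fit(X_train, y_ctr, epochs=10, validation_data=(X_val, y_val_ctr))
```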
We also developed an interactive dashboard using Streamlit to make the solution accessible to creative designers and marketing managers. The dashboard offers:
Summary of model performance and project information
Multiple prediction methods:
Video-based prediction
Image-based prediction
URL-based prediction (directly from creative ad links)
Visual feedback on predictions
Feature importance visualization
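A hypothetical Streamlit skeleton mirroring this layout is sketched below; the widget labels are assumptions, and the prediction calls are static placeholders for the project's real helpers.

```python
import streamlit as st

st.title("Creative Performance Predictor")
st.sidebar.markdown("Model performance summary and project information")

method = st.radio("Prediction method", ["Video", "Image", "URL"])
if method == "Video":
    source = st.file_uploader("Upload a creative video", type=["mp4"])
elif method == "Image":
    source = st.file_uploader("Upload a creative frame", type=["png", "jpg"])
else:
    source = st.text_input("Creative ad link")

if st.button("Predict CTR / ER") and source:
    # A real app would run asset extraction, feature engineering, and the
    # trained LSTM models here; shown as static placeholders.
    st.metric("Predicted CTR", "0.00")
    st.metric("Predicted ER", "0.00")
```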
For URL-based predictions, we implemented a complete data pipeline that:
Takes a given creative ad link
Uses Selenium to extract assets
Applies feature extraction steps
Generates a CSV with all defined features
Predicts CTR and ER values using the trained models
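The steps above chain together roughly as follows; extract_assets, extract_features, and to_windows are hypothetical names standing in for the capture, feature extraction, and windowing steps sketched earlier.

```python
import pandas as pd

def predict_from_url(url: str) -> dict:
    assets = extract_assets(url)                  # Selenium-driven capture (hypothetical)
    features: pd.DataFrame = extract_features(assets)  # audio/text/image features
    features.to_csv("features.csv", index=False)  # persist the defined features
    windows = to_windows(features, window=45)     # shape expected by the LSTMs
    return {"ctr": float(ctr_model.predict(windows)[0, 0]),
            "er": float(er_model.predict(windows)[0, 0])}
```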
This dashboard enables non-technical users to leverage the predictive power of the models without understanding the underlying complexity.
A demo video of the dashboard is embedded below; please wait for it to load.