Generative AI Short Course

Last updated: February 28, 2025

By M. Ali Yousuf

[Disclaimer: The code on this page is NOT written by me and comes from the original author/website mentioned in each Google Colab notebook. I have made modifications, when needed, to make them run from within Google Colab, and some extra code/text added to explain the code.]

[Note: If you are new to the field of AI and have not learned it yet, there is a precursor to this course, "Machine Learning For Non-Computer Science Majors" which you can find here https://sites.google.com/view/aiandml4all/home]

Introduction

References

A note on Google Colab notebooks

What is Generative AI?

Text-to-Text

Prompting

Python Code Generation using CodeLlama

Python Code Debugging using CodeLlama

Prompting to Explain an Image

Auto-Completion Style Text Generation with GPT-2 Model

Translation from one (human) language to another

Text-to-Images

Text-to-images (Generate images using text prompts)

Stable Diffusion for Colorful Images

Illusion Diffusion for Illusive Images

Stable Diffusion to generate images

Generate Realistic Human Faces

Text-to-Music

Text-to-Video

Text-to-Video (Not free. See 'Comics Video Generator' below for a free option)

Comics Video Generator

Hailuoai / Minimax Video Generation

Voice/Speech-to-Text

Using a downloaded copy of Whisper

Summarize YouTube Reviews using Assembly.AI (paid account needed)

Miscellaneous

Sentiment Analysis / Classification using Scikit LLM (skllm)

Ollama and its various Models

Unsloth Notebooks

LLM Consortium

Introduction

This course is a collection of labs to give you some hands-on experience. Please note that I am NOT the author of these examples - I have just collected them here with some additional notes in the notebooks. You will find links to the original versions within the notebooks.

Generative artificial intelligence (AI) is a type of AI that uses existing data to create new and realistic content. This content can include text, images, audio, video, and more. Generative AI is different from other types of AI because it produces new data, rather than just analyzing it. The goal is to create content that is similar to what humans would create. [Text generated by Google AI].

Generative AI utilizes various machine learning methods to create new data. Here are some common ones:

Generative Adversarial Networks (GANs): Two neural networks compete, with one generating new data (generator) and the other evaluating its authenticity (discriminator).
Diffusion Models: These models progressively add noise to data, then learn to reverse the process, essentially denoising the data to create new content.
Variational Autoencoders (VAEs): These models encode data into a latent space, allowing for the manipulation and generation of similar data.
Transformers: While not exclusive to generative AI, transformers are powerful neural network architectures that can be used for various tasks, including generating different creative text formats.

References

Some of the sources used:

SuperAnnotate - Introduction to diffusion models for machine learning
TechTarget SearchEnterpriseAI - What is Generative AI? Everything You Need to Know
IBM - What is Generative AI?
Medium - Generative AI (Part-1)
Encord - An Introduction to Diffusion Models for Machine Learning

A note on Google Colab notebooks

When you will click on any file below, it will open and you will be able to make changes. However, you will NOT be able to save those changes as these are my files and you are only a viewer. To get full access, make a 'copy' by clicking on File -> 'Save a Copy in Drive'. That way, you will have a copy of the file in YOUR google drive, under the folder 'Colab Notebooks' (generally a yellow colored folder symbol). You can edit that version as much as you want and it will be saved on your drive. In case you mess up, go back to the original link to the file in my Google Drive and make another fresh copy!

What is Generative AI?

My own presentation on the topic: What is Generative AI (Google Slides)
Another of my presentations on Generative Art (Google Slides).
Understanding LLMs from Scratch Using Middle School Math, https://towardsdatascience.com/understanding-llms-from-scratch-using-middle-school-math-e602d27ec876
Generative AI exists because of the transformer (Financial Time article, very good): https://ig.ft.com/generative-ai/
The IBM Technology YouTube channel has many high-quality videos, all very short (less than 10 minutes each) describing various aspects of the field. Here are two starting points:
- What are Generative AI models? (IBM Technology), https://www.youtube.com/watch?v=hfIUstzHs9A&t=430s
- How Large Language Models Work (IBM Technology), https://youtu.be/5sLYAQS9sWQ?si=WLtiQsya0x6bQxFb

Text-to-Text

Prompting

In generative AI, prompting is the process of providing instructions or examples to an AI platform to produce a response.
We are going to use Replicate, and for that, you need to open an account at https://replicate.com/ and for that you need a Github account via https://github.com/. Github, in turn, may ask you to verify using an authentication app (I know it is annoying but at least it is not programming!)
After signing up to Replicate, DO NOT "Add payment method" but use "Run AI with an API" which is the free option.
you can access your API token by clicking on your profile (left-hand corner -> API Tokens). The token will be a long string of numbers and letters, starting with r8_YMm... Save it as you will need it in the following jupyter notebook.
Keep in mind that tokens are generally visible, even to the owner, only once. Hence you are supposed to write them somewhere for future reference.
Now run the following jupyter notebook:
https://colab.research.google.com/drive/1dy_-qF5SKxgZmF1xPQW8cgrc_q--jXZU?usp=sharing

Python Code Generation using CodeLlama

We'll use Together.AI and hence you need open an account and get the API key from them.
Here is the code to generate python code!

Python Code Debugging using CodeLlama

We'll use Together.AI and hence you need open an account and get the API key from them.
Here is the code to debug.

Prompting to Explain an Image

Prompting to explain an image 1, using OpenAI. Here we have imported an image from a website. If you click on that image twice, you will get to see the html behind it. You can then change that image to some other image.
This code tries to explain what it 'sees' in the image. The explanation appears in the last line but without text wrapping. Hence you have to scroll to the right to see the full explanation.
https://colab.research.google.com/drive/1llJp71lADh9Z0Pukv4ixkbhMHUcZPusP?usp=sharing

Auto-Completion Style Text Generation with GPT-2 Model

Find the original article here https://machinelearningmastery.com/auto-completion-style-text-generation-with-gpt-2-model/?utm_source=drip&utm_medium=email&utm_campaign=MLM+Newsletter+February+28%2C+2025&utm_content=Auto-Completion+Style+Text+Generation+with+GPT-2+Model+%E2%80%A2+Your+First+Machine+Learning+Project+in+Python+Step-By-Step
The corresponding Google Colab notebook: https://colab.research.google.com/drive/1bIaKh8RhIsCsk_R34a767y-r87EIxS4m?usp=sharing

Translation from one (human) language to another

I don't have time right now but here is a complete example. I'll convert it into Google Colab notebook later,
It uses OpenAI and HuggingFace (and Streamlit to convert to a web application)
https://www.analyticsvidhya.com/blog/2023/07/build-your-own-translator-with-llms-hugging-face/

Text-to-Images

Text-to-images (Generate images using text prompts)

Generative AI can generate images for you, based on a text prompt. You can try that in Google Slides where Gemini is built-in.
But we are going to use Python and will use OpenAI's API. Open AI offers a free token in the first 3 months. After that, you have to pay!
The following jupyter notebook, once run, does not show the image but rather the URL to the image. You can click on that link to see the image.
https://colab.research.google.com/drive/1_YLrIYb9hnvriViqzBIwqWXmq41eeXUO?usp=sharing

Stable Diffusion for Colorful Images

"The most advanced text to image generation service, Stable Image Ultra creates the highest quality images with unprecedented prompt understanding. Ultra excels in typography, complex compositions, dynamic lighting, vibrant hues, and overall cohesion and structure of an art piece. Made from the most advanced models, including Stable Diffusion 3, Ultra offers the best of the Stable Diffusion ecosystem." [From https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1ultra/post ]
To star, you must open an account at https://platform.stability.ai/ to get the API key.
Now open the Google Colab file (this file takes time to load and save): https://colab.research.google.com/github/stability-ai/stability-sdk/blob/main/nbs/Stable_Image_API_Public.ipynb#scrollTo=rgv9qma6OOue

Illusion Diffusion for Illusive Images

You need an account on HuggingFace (free), via https://huggingface.co/
Once you have an account, open the page https://huggingface.co/spaces/AP123/IllusionDiffusion
In this case, you can use it without Python programming, though an API is also available.
To understand its use, see the video: https://youtu.be/rhUZKqjdrO8?feature=shared
If you feel HuggingFace is slow, you can run the same app via Google Colab. See this notebook:
https://colab.research.google.com/drive/1rZmZ-jCGFZss0Dh2HwTqXQA7MmJcNbBS?usp=sharing
When you run it, it will give you a link as "Loaded as API: https://ap123-illusiondiffusion.hf.space". Clicking on it opens another webpage but nothing is visible. Refresh that page and you will see the same type of interface as you saw on HuggingFace but this one is now running on your Google Colab!

Stable Diffusion to generate images

Again, we'll use Hugging Face spaces to do that. Check https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium
We can do the same via Python / Google Colab API, https://colab.research.google.com/drive/1tjlNToUBJRCtNxj-liIXrqoVecKIc_Ls?usp=sharing
When you run it, it will give you a link that looks like "https://ap123-illusiondiffusion.hf.space". Clicking on it opens another webpage but nothing is visible. Refresh that page and you will see the same type of interface as you saw on HuggingFace but this one is now running on your Google Colab!

Generate Realistic Human Faces

We again use the free REPLICATE account
Run the Google colab notebook, Generate Realistic Human Faces.ipynb

Text-to-Music

Text-to-Music

Once again, we use Replicate to play music. Note that music appears as a weblink which you must click to listen to it.
Needless to say, it requires the API token from Replicate (the same one you used earlier).
https://colab.research.google.com/drive/1pn8rxRT1sUfP8cnaUymNLctoOWSjYyds?usp=sharin

Text-to-Video

Text-to-Video (Not free. See 'Comics Video Generator' below for a free option)

Though this is one of the coolest parts of GenAI, it cannot be done on normal PCs with CPUs or even on the free Google Colab. You must switch from CPU to GPU before running this file. And that requires a paid subscription to Google Colab.
The cheapest is $9.99 for 100 credits and these have to be used within 3 months. After that, you are not charged again until you decide to buy more credit.
Here we'll use HuggingFace and you need a free account there, https://huggingface.co/.
You will see the video generated in the left-hand column, underneath the folder 'sample_data' (but outside it). Download and run it.
The first part creates a very short video and takes 2 minutes to run. The second part creates a longer video with a longer prompt. Try both!
https://colab.research.google.com/drive/11uUkg5hnIV9eHKBPuDoDsmkZcZOTNOzn?usp=sharing

Comics Video Generator

Here again we'll use HuggingFace but will use their free GPUs to run the code.
Using HuggingFace, https://huggingface.co/spaces/ADOPLE/Video-Generator-AI
Or via Python API, https://colab.research.google.com/drive/1zhAFG0xb4OlsOve7x_fJeOUvVrJCQEIp?usp=sharing

Hailuoai / Minimax Video Generation

Generates Videos (Free, but the website is partly in Chinese but Video descriptions can be in English.
Article: This AI Tool Could Replace Filmmakers https://www.futureblueprint.xyz/p/ai-tool-replace-filmmakers
The web site to create videos is: https://hailuoai.com/video
See a Youtube video at https://www.youtube.com/watch?v=3ZGi4vp1NHM

Voice/Speech-to-Text

Using a downloaded copy of Whisper

Whisper is an automatic speech recognition (ASR) system developed by OpenAI that transcribes spoken language into text, https://openai.com/index/whisper/
You DON'T need an OpenAI account to run this code!
https://colab.research.google.com/drive/1HRHvTAPhsAJHCIMWFFPDIJJDr5tAwlYA?usp=sharing

Summarize YouTube Reviews using Assembly.AI (paid account needed)

A GoogleColab version will be added soon. This is just a video link: https://www.youtube.com/watch?v=sh7RRkYGts4&t=28s

Miscellaneous

Sentiment Analysis / Classification using Scikit LLM (skllm)

This one requires a paid account with OpenAI, no choice.
Get the key from https://platform.openai.com/organization/api-keys
Get the organization ID from https://platform.openai.com/settings/organization/general
Here is the Google Colab file, https://colab.research.google.com/drive/13RDW8wW9UH78KZRpWNmIeKmgqsAndWKf?usp=sharing
This model learns with just 30 samples of sentiments and labels!

Ollama and its various Models

First of all, download and install Ollama for your operating system (Available for macOS, Linux, and Windows (preview)), https://ollama.com/
If you have Windows, you can use Windows Subsystem for Linux (WSL) which is pre-installed on most modern Windows machines.
Once installed, you can download and run many LLM models on your computers. See the list of models here: https://ollama.com/library
There is a YouTube channel on Ollama with lots of good videos: https://www.youtube.com/@technovangelist
This video is to start with it: https://www.youtube.com/watch?v=2Pm93agyxx4

Unsloth Notebooks

Unsloth makes finetuning large language models like Llama-3, Mistral, Phi-3 and Gemma 2x faster, use 70% less memory, and with no degradation in accuracy!
Their notebooks run on (free) Google Colab and hence are easy to use. Just make a copy and use them.
https://docs.unsloth.ai/get-started/unsloth-notebooks
One example that I know runs on the free version of Google Colab is "Alpaca + TinyLlama + RoPE Scaling full example": https://colab.research.google.com/drive/1ZFSXnEf9o7kWJ1KqhsqDSD8Ny9wC8Ly3?usp=sharing

LLM Consortium

My copy of the code, https://colab.research.google.com/drive/1nV02Ppz7fjrAFERve19mYkXlm7Tb22YC?usp=sharing
The original at: https://colab.research.google.com/drive/1OnIipRwuHOZbKHN0haHGD0OnckBGfzqx
and the corresponding LinkedIn post is: https://www.linkedin.com/posts/masci_my-holiday-project-for-this-season-yet-another-activity-7282066767613480960-ye0o/?utm_source=share&utm_medium=member_desktop

Page updated

Google Sites

Report abuse