As a subset of machine learning, generative AI models can learn and uncover complicated patterns in massive datasets.
import numpy as np
import matplotlib.pyplot as plt

outsider_image = outsider.reshape(48, 48)  # reshape the flat pixel vector into a 48x48 image
plt.imshow(outsider_image, cmap='gray')
plt.title('Outsider Image')
plt.show()

Reshapes the image's pixels into a 48x48 grid and displays this example image from our dataset in grayscale.
def generate_face_image(latent1, latent2, latent3):
    # Pack the three slider values into a single latent point
    latent_vars = np.array([[latent1, latent2, latent3]])
    # Decode the latent point back into pixel space
    reconstruction = np.array(face_decoder(latent_vars))
    reconstruction = reconstruction.reshape(48, 48, 1)
    plt.figure()
    plt.imshow(reconstruction, cmap='gray')
    plt.axis('off')
    plt.show()

Allows us to plot an image we construct ourselves by choosing a point in the latent space.
import ipywidgets as widgets

face_image_generator = widgets.interact(
    generate_face_image,
    latent1=(latent1_min, latent1_max),
    latent2=(latent2_min, latent2_max),
    latent3=(latent3_min, latent3_max),
)

Using ipywidgets, we create three sliders, one per latent variable, that let us interactively construct and deconstruct images to create thousands of variations. The slider bounds (latent1_min, latent1_max, and so on) can be taken from the encoded training data, as in the sketch below.
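A minimal sketch of computing those bounds, assuming face_encoder and x_train are defined earlier in the notebook (these names are assumptions, not verbatim notebook code):

# Encode the training set and take each latent dimension's extremes
encoded = face_encoder.predict(x_train)
latent1_min, latent1_max = encoded[:, 0].min(), encoded[:, 0].max()
latent2_min, latent2_max = encoded[:, 1].min(), encoded[:, 1].max()
latent3_min, latent3_max = encoded[:, 2].min(), encoded[:, 2].max()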
Fashion VAE
def build_vae(num_pixels, num_latent_vars=2):
    # ... encoder and decoder layers are defined here (elided) ...
    encoder = tf.keras.Model(inputs=encoder_inputs, outputs=z)
    decoder = tf.keras.Model(inputs=decoder_inputs, outputs=reconstruction)

Here we build our VAE from two networks: an encoder that deconstructs images into latent variables and a decoder that reconstructs images from them.
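Since the layer definitions are elided above, here is a hedged sketch of what build_vae might contain. Dense layers and the reparameterization trick are the standard ingredients, but the layer sizes and activations are illustrative assumptions, not the notebook's exact architecture:

import tensorflow as tf

def build_vae(num_pixels, num_latent_vars=2):
    # Encoder: image -> parameters of a Gaussian in latent space -> sampled z
    encoder_inputs = tf.keras.Input(shape=(num_pixels,))
    h = tf.keras.layers.Dense(256, activation='relu')(encoder_inputs)
    z_mean = tf.keras.layers.Dense(num_latent_vars)(h)
    z_log_var = tf.keras.layers.Dense(num_latent_vars)(h)

    def sample(args):
        # Reparameterization trick: z = mean + sigma * epsilon
        mean, log_var = args
        epsilon = tf.random.normal(shape=tf.shape(mean))
        return mean + tf.exp(0.5 * log_var) * epsilon

    z = tf.keras.layers.Lambda(sample)([z_mean, z_log_var])
    encoder = tf.keras.Model(inputs=encoder_inputs, outputs=z)

    # Decoder: latent point -> reconstructed image
    decoder_inputs = tf.keras.Input(shape=(num_latent_vars,))
    h = tf.keras.layers.Dense(256, activation='relu')(decoder_inputs)
    reconstruction = tf.keras.layers.Dense(num_pixels, activation='sigmoid')(h)
    decoder = tf.keras.Model(inputs=decoder_inputs, outputs=reconstruction)

    # End-to-end model: encode, sample, decode. During training, a KL-divergence
    # term computed from z_mean and z_log_var is added to the reconstruction
    # loss (see the loss sketch later in this section).
    vae = tf.keras.Model(inputs=encoder_inputs, outputs=decoder(z))
    return vae, encoder, decoder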
An example image after being deconstructed and reconstructed by the VAE.
Example images from our Fashion MNIST dataset.
Uses ipywidgets to create an interactive widget with two latent sliders, sketched below.
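A minimal sketch of that two-slider widget, assuming fashion_decoder comes from build_vae and the 28x28 Fashion MNIST images; the slider ranges here are illustrative:

def generate_fashion_image(latent1, latent2):
    # Decode a 2-D latent point into a 28x28 Fashion MNIST-style image
    img = np.array(fashion_decoder(np.array([[latent1, latent2]])))
    plt.imshow(img.reshape(28, 28), cmap='gray')
    plt.axis('off')
    plt.show()

# (-3, 3) is an assumed range; bounds from the encoded training data also work
widgets.interact(generate_fashion_image, latent1=(-3.0, 3.0), latent2=(-3.0, 3.0))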
import matplotlib.pyplot as plt
latent_space = fashion_encoder.predict(x_train)
plt.figure(figsize=(8, 6)) # Adjust figure size if needed
plt.scatter(latent_space[:, 0], latent_space[:, 1], c=y_train, cmap='viridis') # Color by labels
plt.colorbar(label='Labels') # Add colorbar
plt.xlabel('Latent Dimension 1')
plt.ylabel('Latent Dimension 2')
plt.title('Latent Space Scatter Chart')
plt.show()
Takes the encoded representation of the Fashion MNIST dataset within the latent space and visualizes it as a scatter plot. Each point in the plot represents an image from the dataset, and its position is determined by the values of its latent variables. The points are colored according to their corresponding labels, allowing us to see how different classes of images are clustered or separated within the latent space.
Variational autoencoder (VAE): proposed in 2013 by Diederik P. Kingma and Max Welling, then at the University of Amsterdam. A VAE provides a probabilistic manner for describing an observation in latent space. Thus, rather than building an encoder that outputs a single value to describe each latent state attribute, we formulate our encoder to describe a probability distribution for each latent attribute. VAEs have many applications, such as data compression and synthetic data creation.
Allows us to generate variations of an image from a single input.
Creates variations between two images by interpolating through the latent space. Both capabilities are sketched below.
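Both can be sketched with a few (assumed) encoder/decoder calls; fashion_encoder, fashion_decoder, img_a, and img_b are placeholder names rather than verbatim notebook code:

# Variations from one input: jitter the encoded point and decode each sample
z = fashion_encoder.predict(img_a.reshape(1, -1))
plt.figure(figsize=(8, 2))
for i in range(5):
    z_noisy = z + np.random.normal(scale=0.2, size=z.shape)  # small latent nudge
    variation = np.array(fashion_decoder(z_noisy)).reshape(28, 28)
    plt.subplot(1, 5, i + 1)
    plt.imshow(variation, cmap='gray')
    plt.axis('off')
plt.show()

# Variations between two images: walk a straight line through latent space
z_a = fashion_encoder.predict(img_a.reshape(1, -1))
z_b = fashion_encoder.predict(img_b.reshape(1, -1))
steps = np.linspace(0.0, 1.0, 8)
plt.figure(figsize=(12, 2))
for i, t in enumerate(steps):
    frame = np.array(fashion_decoder((1 - t) * z_a + t * z_b)).reshape(28, 28)
    plt.subplot(1, len(steps), i + 1)
    plt.imshow(frame, cmap='gray')
    plt.axis('off')
plt.show()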
Architecture of VAE
The encoder-decoder architecture lies at the heart of Variational Autoencoders (VAEs), distinguishing them from traditional autoencoders. The encoder network takes raw input data and transforms it into a probability distribution within the latent space.
The encoder generates the latent code, which is a probabilistic encoding: rather than a single point in the latent space, the VAE expresses a distribution of potential representations.
The decoder, in turn, takes a sampled point from the latent distribution and reconstructs it back into data space. During training, the model refines both the encoder and decoder parameters to minimize the reconstruction loss, the disparity between the input data and the decoded output. The goal is not just accurate reconstruction but also a regularized latent space: a Kullback-Leibler (KL) divergence term pushes each encoded distribution toward a standard normal prior, ensuring the latent space conforms to a specified distribution.
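A sketch of that combined objective as a standalone function; squared error is an assumed reconstruction term, not necessarily the notebook's exact loss:

import tensorflow as tf

def vae_loss(x, x_reconstructed, z_mean, z_log_var):
    # Reconstruction term: pixel-wise error between input and decoded output
    reconstruction_loss = tf.reduce_sum(tf.square(x - x_reconstructed), axis=-1)
    # KL term: divergence between the encoded Gaussian N(z_mean, exp(z_log_var))
    # and the standard normal prior N(0, I)
    kl_loss = -0.5 * tf.reduce_sum(
        1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)
    return tf.reduce_mean(reconstruction_loss + kl_loss)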
For example, the encoder can transform the image of a little girl into a vector such as [1.2, -0.5].
The decoder transforms the vector [1.2, -0.5] back into the little girl, not the same image, but a slightly different one, a variation of the original.
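In code, that round trip is just two calls; girl_image is a hypothetical flattened input, and the exact latent values will differ per model:

# Hypothetical round trip with the encoder/decoder pair from build_vae;
# girl_image is an illustrative input, not a variable from the notebook
z = encoder.predict(girl_image.reshape(1, -1))  # e.g. array([[ 1.2, -0.5]])
variation = decoder.predict(z)                  # a new, slightly different image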