Advanced Applied Deep Learning

Practice Course

Sheng Yun Wu

Week 4: Regularization Techniques and Preventing Overfitting

In Week 4, students focus on regularization techniques that prevent overfitting, improve generalization, and stabilize the training of deep learning models. The examples cover dropout, L1 and L2 regularization, early stopping, data augmentation, and batch normalization. These practices matter most when training deep learning models on small or noisy datasets.

Example 1: Introduction to Overfitting and Underfitting

Description:
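This example demonstrates underfitting and overfitting with two small fully connected networks trained on MNIST. A model with too little capacity underfits: both training and validation accuracy stay low. A model with too much capacity can overfit: training accuracy keeps rising while validation accuracy stalls or falls.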

import tensorflow as tf
from tensorflow.keras import models, layers

# Load MNIST dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

# Reshape and normalize the data
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32') / 255

# Display shape of the dataset
print(f"Train images shape: {train_images.shape}")
print(f"Test images shape: {test_images.shape}")

# Build a small underfitting model
underfit_model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(16, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Build a larger overfitting model
overfit_model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(512, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the models
underfit_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
overfit_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the models
underfit_model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
overfit_model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
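One rough way to see why the two models behave differently is to compare their capacity. The sketch below counts trainable parameters for stacks of Dense layers with the same layer sizes as the two models above. The helper names `dense_params` and `mlp_params` are made up for this illustration and are not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Parameters in one Dense layer: a weight per input-unit pair plus a bias per unit."""
    return n_inputs * n_units + n_units


def mlp_params(layer_sizes):
    """Total parameters for consecutive Dense layers, given [input, hidden..., output] sizes."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


# Flatten turns each 28x28x1 image into 784 inputs and adds no parameters.
underfit_params = mlp_params([28 * 28, 16, 10])
overfit_params = mlp_params([28 * 28, 512, 10])

print(underfit_params)  # 12730
print(overfit_params)   # 407050
```

The larger model has roughly 32 times as many parameters, which gives it far more room to memorize training examples. Calling `underfit_model.summary()` should report the same totals.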

Example 2: Using Dropout for Regularization

Description:
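This example introduces dropout, which helps prevent overfitting by randomly setting a fraction of a layer's units (neurons) to zero on each training step. Dropout is active only during training; at inference time all units are used.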

# Build a model with dropout layers
dropout_model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])

# Compile and train the model with dropout
dropout_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
dropout_model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
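To make the mechanism concrete, here is a minimal pure-Python sketch of "inverted" dropout, the variant in which surviving activations are scaled by 1 / (1 - rate) during training so that no rescaling is needed at inference. The function name `dropout` and its arguments are assumptions for this illustration; it is not the Keras implementation.

```python
import random


def dropout(activations, rate, training, rng):
    """Zero each activation with probability `rate` and rescale the survivors.

    When `training` is False the activations pass through unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)  # seeded only so the illustration is repeatable
activations = [1.0] * 10

train_out = dropout(activations, rate=0.5, training=True, rng=rng)
infer_out = dropout(activations, rate=0.5, training=False, rng=rng)

print(train_out)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
print(infer_out)  # identical to the input
```

Because a different random subset of units is dropped on every step, no unit can rely on a particular neighbor being present, which is the intuition behind dropout's regularizing effect.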

Example 3: Implementing L2 (Ridge) Regularization

Description:
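This example applies L2 regularization, also known as ridge regularization, which adds a penalty proportional to the sum of the squared weights to the loss and so discourages large weights. Passing the string 'l2' uses the Keras default penalty factor of 0.01.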

# Build a model with L2 regularization
l2_model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(512, kernel_regularizer='l2', activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile and train the model with L2 regularization
l2_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
l2_model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
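The penalty itself is simple arithmetic. The sketch below computes an L2 penalty as factor times the sum of squared weights, together with its gradient, for a handful of made-up weights. The weight values and the `data_loss` figure are hypothetical, and the helper names are not Keras functions.

```python
def l2_penalty(weights, factor=0.01):
    """L2 penalty: factor * sum of squared weights."""
    return factor * sum(w * w for w in weights)


def l2_penalty_grad(weights, factor=0.01):
    """Gradient of the penalty with respect to each weight: 2 * factor * w."""
    return [2.0 * factor * w for w in weights]


weights = [0.5, -2.0, 0.0, 1.0]  # hypothetical kernel weights
data_loss = 0.30                 # hypothetical cross-entropy value

total_loss = data_loss + l2_penalty(weights)
print(total_loss)                # approximately 0.3525
print(l2_penalty_grad(weights))  # larger weights are pulled toward zero harder
```

The gradient is proportional to each weight, so large weights shrink quickly while small ones are barely affected. This is why L2 tends to produce many small weights rather than exact zeros.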

Example 4: Applying L1 (Lasso) Regularization

Description:
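This example introduces L1 regularization (lasso), which adds a penalty proportional to the sum of the absolute values of the weights. This pushes many weights toward zero and so encourages sparsity. As with 'l2', the string 'l1' uses the Keras default penalty factor of 0.01.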

# Build a model with L1 regularization
l1_model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(512, kernel_regularizer='l1', activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile and train the model with L1 regularization
l1_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
l1_model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
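A small sketch can show why an absolute-value penalty favors sparsity. It computes the L1 penalty and then applies soft-thresholding, the textbook proximal step for an L1 term, to some made-up weights. Soft-thresholding is used here only to illustrate the effect; Keras simply adds the penalty to the loss and lets the optimizer follow its gradient. All names and values are hypothetical.

```python
def l1_penalty(weights, factor=0.01):
    """L1 penalty: factor * sum of absolute weights."""
    return factor * sum(abs(w) for w in weights)


def soft_threshold(w, threshold):
    """Shrink w toward zero by `threshold`, clamping to exactly zero if it would cross."""
    if w > threshold:
        return w - threshold
    if w < -threshold:
        return w + threshold
    return 0.0


weights = [0.5, -2.0, 0.05, -0.08]  # hypothetical kernel weights
shrunk = [soft_threshold(w, 0.1) for w in weights]

print(l1_penalty(weights))
print(shrunk)  # the two small weights become exactly 0.0
```

Unlike the L2 case, every nonzero weight is pushed by the same amount regardless of its size, so small weights reach zero while large ones are only slightly reduced.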

Example 5: Combining Dropout and L2 Regularization

Description:
This example combines dropout and L2 regularization in a single model to demonstrate how the two techniques can work together to improve generalization.

# Build a model with both dropout and L2 regularization
dropout_l2_model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(512, kernel_regularizer='l2', activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(512, kernel_regularizer='l2', activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])

# Compile and train the model with dropout and L2 regularization
dropout_l2_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
dropout_l2_model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))

Example 6: Early Stopping to Prevent Overfitting

Description:
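Early stopping halts training once the monitored validation metric has stopped improving, so the model does not keep fitting noise in the training data. In this example training stops when val_loss has failed to improve for 3 consecutive epochs (patience=3). The example monitors loss on the test set for simplicity; in practice a separate validation split should be used so that the test set stays unseen.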

from tensorflow.keras.callbacks import EarlyStopping

# Add early stopping callback
early_stopping = EarlyStopping(monitor='val_loss', patience=3)

# Build a simple model
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(512, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile and train the model with early stopping
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=20, validation_data=(test_images, test_labels), callbacks=[early_stopping])
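The patience rule can be sketched in a few lines of plain Python. The function below is a simplified illustration of the bookkeeping rather than the Keras callback itself; the name `early_stopping_epoch` and the validation losses are made up.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 0-based epoch at which training would stop.

    Training stops once `patience` consecutive epochs fail to improve on the
    best validation loss seen so far by more than `min_delta`. If that never
    happens, the index of the last epoch is returned.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss  # new best: reset the patience counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical validation losses: the best value occurs at epoch 2.
val_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.30]
print(early_stopping_epoch(val_losses, patience=3))  # 5
```

Training stops at epoch 5, three epochs after the best one, and never reaches the later improvement at epoch 6. A larger patience tolerates such plateaus at the cost of longer training. In Keras, passing `restore_best_weights=True` to `EarlyStopping` additionally rolls the model back to the weights from the best epoch.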

Example 7: Data Augmentation for Regularization

Description:
This example introduces data augmentation, a technique to artificially increase the diversity of the training dataset by applying random transformations to the images.

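from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define data augmentation generator
# (ImageDataGenerator is deprecated in recent TensorFlow releases in favor of
# Keras preprocessing layers, but it still illustrates the idea.)
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=False,  # mirrored digits are not valid digits, so flipping is disabled for MNIST
    fill_mode='nearest'
)

# Train the model (defined in Example 6) using augmented data
model.fit(datagen.flow(train_images, train_labels, batch_size=64), epochs=5, validation_data=(test_images, test_labels))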
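To show what one such random transformation does to the pixels, here is a minimal pure-Python sketch of a random shift on a tiny 2D image. The function names are hypothetical, and the sketch fills exposed pixels with a constant value rather than using the 'nearest' fill mode configured above.

```python
import random


def shift_image(image, dx, dy, fill=0.0):
    """Shift a 2D image (a list of rows) by dx columns and dy rows.

    Pixels moved outside the frame are discarded; exposed pixels get `fill`.
    """
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                shifted[ny][nx] = image[y][x]
    return shifted


def random_shift(image, max_shift, rng):
    """Apply a shift drawn uniformly from [-max_shift, max_shift] on each axis."""
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return shift_image(image, dx, dy)


image = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
print(shift_image(image, dx=1, dy=0))  # the bright pixel moves one column right
augmented = random_shift(image, max_shift=1, rng=random.Random(0))
```

The label is unchanged by the shift, so each epoch the network sees slightly different inputs for the same targets. This discourages it from memorizing exact pixel positions.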

Example 8: Batch Normalization for Improved Training

Description:
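This example demonstrates batch normalization, which normalizes each layer's activations using the mean and variance of the current mini-batch during training. This typically speeds up and stabilizes training, and it can also have a mild regularizing effect.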

# Build a model with batch normalization
batch_norm_model = models.Sequential([
    layers.Flatten(input_shape=(28, 28, 1)),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(10, activation='softmax')
])

# Compile and train the model with batch normalization
batch_norm_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
batch_norm_model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
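The training-time computation for a single feature can be sketched as follows. This is a simplified illustration with made-up values: the function name `batch_norm` is hypothetical, and `gamma` and `beta` stand in for the learned scale and shift, shown here at their usual initial values of 1 and 0.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize one feature across a mini-batch, then scale and shift it."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


# Hypothetical activations of one unit for a mini-batch of four examples
batch = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(batch)
print(normalized)  # centered on zero with variance close to one
```

At inference time Keras uses moving averages of the mean and variance collected during training instead of per-batch statistics, so predictions do not depend on which other examples share the batch.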

Example 9: Visualizing the Effects of Regularization

Description:
Students use this example to plot learning curves (training and validation loss) and observe the effect of regularization. The code plots the curves for the dropout model; the same approach applies to the L1 and L2 models.

import matplotlib.pyplot as plt

# Train the dropout model for 10 more epochs and keep the history
# (this continues from the weights already learned in Example 2)
history = dropout_model.fit(train_images, train_labels, epochs=10, validation_data=(test_images, test_labels))

# Plot training and validation loss
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Training and Validation Loss with Dropout')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
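Alongside the plot, a couple of numbers can summarize a learning curve. The sketch below takes a dictionary shaped like `history.history` and reports the epoch with the lowest validation loss and the final gap between validation and training loss. The function name and the loss values are hypothetical.

```python
def summarize_history(history):
    """Summarize a dict with equal-length 'loss' and 'val_loss' lists."""
    val = history['val_loss']
    best_epoch = min(range(len(val)), key=val.__getitem__)
    final_gap = val[-1] - history['loss'][-1]
    return {
        'best_epoch': best_epoch,          # 0-based epoch with the lowest val_loss
        'best_val_loss': val[best_epoch],
        'final_gap': final_gap,            # a large positive gap suggests overfitting
    }


# Hypothetical curves: training loss keeps falling while validation loss turns upward
example = {'loss': [0.40, 0.25, 0.15, 0.08], 'val_loss': [0.35, 0.28, 0.27, 0.31]}
summary = summarize_history(example)
print(summary)
```

In this made-up run the validation loss is lowest at epoch 2 and the gap keeps widening afterwards, which is the pattern regularization is meant to reduce. With a real run, the same function can be called as `summarize_history(history.history)`.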

Example 10: Hyperparameter Tuning for Regularization

Description:
This example introduces hyperparameter tuning for regularization techniques, such as adjusting dropout rates or L2 penalty strength, to find the best combination for the model.

from sklearn.model_selection import GridSearchCV

# Note: tensorflow.keras.wrappers.scikit_learn was removed in TensorFlow 2.12.
# On newer versions, the separate SciKeras package provides a similar KerasClassifier wrapper.
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

# Define a function to build the model with hyperparameters
def build_model(dropout_rate=0.5, l2_strength=0.001):
    model = models.Sequential([
        layers.Flatten(input_shape=(28, 28, 1)),
        layers.Dense(512, kernel_regularizer=tf.keras.regularizers.l2(l2_strength), activation='relu'),
        layers.Dropout(dropout_rate),
        layers.Dense(512, kernel_regularizer=tf.keras.regularizers.l2(l2_strength), activation='relu'),
        layers.Dropout(dropout_rate),
        layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

# Wrap the model for use with scikit-learn
model = KerasClassifier(build_fn=build_model, epochs=5, batch_size=64, verbose=0)

# Define the hyperparameter grid
param_grid = {'dropout_rate': [0.3, 0.5, 0.7], 'l2_strength': [0.001, 0.01, 0.1]}

# Perform grid search
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3)
grid_result = grid.fit(train_images, train_labels)

# Print best parameters
print(f"Best parameters: {grid_result.best_params_}")
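Before launching a grid search it helps to know how much training it implies. The sketch below expands the same grid into its individual combinations with the standard library and counts the cross-validation fits. The helper `expand_grid` is a made-up name that only mirrors what the grid search enumerates.

```python
from itertools import product


def expand_grid(grid):
    """Return every combination of values in a {name: [values]} grid as a list of dicts."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


param_grid = {'dropout_rate': [0.3, 0.5, 0.7], 'l2_strength': [0.001, 0.01, 0.1]}
cv_folds = 3

combinations = expand_grid(param_grid)
n_fits = len(combinations) * cv_folds

print(len(combinations))  # 9
print(n_fits)             # 27
```

Each of the 27 fits trains a fresh network for 5 epochs, and with its default `refit=True` setting `GridSearchCV` trains the best configuration once more on the full training set. The cost grows multiplicatively with every value added to the grid.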

Week 4 Summary

Objective: Learn and apply various regularization techniques to prevent overfitting and improve generalization in deep learning models.
Skills Developed:
- Understanding the principles of overfitting and underfitting.
- Hands-on implementation of dropout, L1/L2 regularization, early stopping, batch normalization, and data augmentation.
- Visualizing the effects of regularization techniques on model performance.
Tools: TensorFlow, Keras, Matplotlib, scikit-learn.

These 10 examples in Week 4 help students understand how regularization techniques can improve model performance and generalization. By experimenting with different methods, they gain a practical understanding of how to apply these techniques in real-world scenarios.
