The dataset in this module consists of CAPTCHA images and is used for CAPTCHA character recognition.
The dataset can be found on the Kaggle website: https://www.kaggle.com/datasets/fournierp/captcha-version-2-images?resource=download
Or
https://drive.google.com/file/d/1nD5OQR6NOrrkBjzUZw7RfFwWKehgjhOG/view?usp=sharing
In this module, we will implement a Convolutional Neural Network (CNN) for CAPTCHA bypass in a new Google Colab notebook. The dataset contains a large number of CAPTCHA images for training and testing the network.
Convolutional Neural Network for CAPTCHA Bypass
The complete, ready-to-run Google Colab notebook can be found at this link: https://colab.research.google.com/drive/1ic4fxhJ11ZlEDDzYJ0gs3_MRYNl_DCPO#scrollTo=l_9Jqw_1RD-a
Alternatively, to build the notebook yourself, open Google Colab in your browser.
Then click File --> New notebook
Click the notebook name at the top of the page and rename the file (for example, CNN for CAPTCHA Bypass.ipynb).
Next, click Runtime --> Change runtime type and set Hardware accelerator to GPU (training runs much faster on a GPU than on a CPU).
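To confirm that the GPU runtime is active, you can optionally run a quick check; it should list at least one GPU device:
import tensorflow as tf
# Lists the GPUs visible to TensorFlow; an empty list means the
# runtime is still on CPU.
print(tf.config.list_physical_devices("GPU"))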
Let us first import all the necessary libraries:
import os
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
from collections import Counter
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
Now, we will make the dataset available to Google Colab. The image data comes as a folder, so we will mount Google Drive and then use the dataset's directory path for further analysis.
Step 1: Download the dataset from the Kaggle website, then unzip it on your local machine.
Step 2: Upload the data to your Google Drive. For simplicity, you can place the image folder directly in My Drive.
Step 3: Mount Google Drive in your Colab session by running the following code:
# Mount Google Drive
from google.colab import drive
drive.mount("/content/gdrive")
Step 4: Connect to Google Drive --> choose the account where you uploaded the data --> allow Colab to mount it.
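Once the mount succeeds, you can optionally confirm that your uploaded folder is visible (a quick sanity check; the folder name is whatever you chose in Step 2):
import os
# The Drive root appears under this path after mounting; your image
# folder should show up in this listing.
print(os.listdir("/content/gdrive/My Drive"))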
Step 5: Find the directory path of your dataset and run the following code:
data_dir = Path("/content/gdrive/My Drive/your-data-folder-name/")
# Get list of all the images
images = sorted(list(map(str, list(data_dir.glob("*.png")))))
labels = [img.split(os.path.sep)[-1].split(".png")[0] for img in images]
characters = set(char for label in labels for char in label)
print("Number of images found: ", len(images))
print("Number of labels found: ", len(labels))
print("Number of unique characters: ", len(characters))
print("Characters present: ", characters)
# Batch size for training and validation
batch_size = 16
# Desired image dimensions
img_width = 200
img_height = 50
# Factor by which the image is downsampled by the convolutional
# blocks: two blocks, each ending in a 2x2 max-pooling layer,
# hence the total downsampling factor is 4.
downsample_factor = 4
# Maximum length of any captcha in the dataset
max_length = max([len(label) for label in labels])
After running the code, the output indicates that the dataset contains 1040 CAPTCHA files as PNG images, each labelled by its 5-character filename.
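As a quick sanity check on these constants, the downsampling factor tells us how many horizontal time steps the downstream sequence model will see (this assumes the two-pooling-layer architecture described in the comment above):
# Each 2x2 pooling halves the width, so two pooling layers shrink
# 200 pixels to 200 // 4 = 50 time steps.
print("Time steps after downsampling:", img_width // downsample_factor)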
We will now preprocess the data using the following code:
# Mapping characters to integers
char_to_num = layers.StringLookup(
    # sorted() makes the character-to-integer mapping deterministic across runs
    vocabulary=sorted(characters), mask_token=None
)
# Mapping integers back to original characters
num_to_char = layers.StringLookup(
    vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True
)
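To see what these two lookup layers do, here is a minimal round-trip sketch: it encodes one label from the dataset into integers and decodes it back (the exact integer values depend on your dataset's vocabulary):
# Encode a sample label to integers, then invert the mapping.
sample_label = labels[0]
encoded = char_to_num(tf.strings.unicode_split(sample_label, input_encoding="UTF-8"))
decoded = tf.strings.reduce_join(num_to_char(encoded)).numpy().decode("utf-8")
print(sample_label, "->", encoded.numpy(), "->", decoded)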
def split_data(images, labels, train_size=0.9, shuffle=True):
    # 1. Get the total size of the dataset
    size = len(images)
    # 2. Make an indices array and shuffle it, if required
    indices = np.arange(size)
    if shuffle:
        np.random.shuffle(indices)
    # 3. Get the size of training samples
    train_samples = int(size * train_size)
    # 4. Split data into training and validation sets
    x_train, y_train = images[indices[:train_samples]], labels[indices[:train_samples]]
    x_valid, y_valid = images[indices[train_samples:]], labels[indices[train_samples:]]
    return x_train, x_valid, y_train, y_valid
# Splitting data into training and validation sets
x_train, x_valid, y_train, y_valid = split_data(np.array(images), np.array(labels))
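As a quick check, print the split sizes; with the 1040-image dataset and a 90/10 split, this should report 936 training and 104 validation samples:
# Sizes follow from int(1040 * 0.9) = 936 training samples,
# leaving 104 for validation.
print("Training samples:", len(x_train))
print("Validation samples:", len(x_valid))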
def encode_single_sample(img_path, label):
    # 1. Read image
    img = tf.io.read_file(img_path)
    # 2. Decode and convert to grayscale
    img = tf.io.decode_png(img, channels=1)
    # 3. Convert to float32 in [0, 1] range
    img = tf.image.convert_image_dtype(img, tf.float32)
    # 4. Resize to the desired size
    img = tf.image.resize(img, [img_height, img_width])
    # 5. Transpose the image because we want the time
    # dimension to correspond to the width of the image.
    img = tf.transpose(img, perm=[1, 0, 2])
    # 6. Map the characters in label to numbers
    label = char_to_num(tf.strings.unicode_split(label, input_encoding="UTF-8"))
    # 7. Return a dict as our model is expecting two inputs
    return {"image": img, "label": label}
Now we will create tf.data.Dataset objects using the following code:
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = (
    train_dataset.map(
        encode_single_sample, num_parallel_calls=tf.data.AUTOTUNE
    )
    .batch(batch_size)
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)
validation_dataset = tf.data.Dataset.from_tensor_slices((x_valid, y_valid))
validation_dataset = (
    validation_dataset.map(
        encode_single_sample, num_parallel_calls=tf.data.AUTOTUNE
    )
    .batch(batch_size)
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)
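Before visualizing, you can inspect the structure of the batched dataset; the leading None in each shape is the batch dimension:
# Expect something like {"image": (None, 200, 50, 1) float32,
# "label": (None, 5) int64}, matching encode_single_sample's output.
print(train_dataset.element_spec)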
Before building and training the model, let us verify the input pipeline by visualizing one batch of training samples together with their decoded labels. (After training, the same num_to_char decoding is used to read off the model's prediction for each of the 5 characters in a CAPTCHA.)
Visualize the data:
_, ax = plt.subplots(4, 4, figsize=(10, 5))
for batch in train_dataset.take(1):
    images = batch["image"]
    labels = batch["label"]
    for i in range(16):
        img = (images[i] * 255).numpy().astype("uint8")
        label = tf.strings.reduce_join(num_to_char(labels[i])).numpy().decode("utf-8")
        ax[i // 4, i % 4].imshow(img[:, :, 0].T, cmap="gray")
        ax[i // 4, i % 4].set_title(label)
        ax[i // 4, i % 4].axis("off")
plt.show()
Finally, the resulting grid shows 16 preprocessed CAPTCHA images alongside their ground-truth labels, confirming that images and labels are correctly paired before the model is built and trained.