In deep learning, the Generative Adversarial Network is one of the fastest-growing research areas and has been used successfully to generate high-resolution images purely from noise. Classical discriminative models classify their input by modeling the conditional probability P(Y|X), the probability of a class Y given some already known data X. Generative models, in contrast, model the joint probability distribution P(X, Y) of the data and the class [1]. As a concrete example, a discriminative model labels a given image as a cat, while a generative model can generate new images of cats. Many modern generative models are based on adversarial training and are used for applications like image generation, image translation, text generation, text translation, etc.
The terminology introduced here will help in understanding the research gap discussed in the next section.
A GAN consists of two neural architectures, a generator G and a discriminator D, as shown in figure 1. The generator’s job is to produce new images purely from a noise vector z. The discriminator’s job is to classify these generated images as real or fake (real refers to images from the original dataset, and fake refers to images generated from noise vectors). The ultimate goal of the generator is to produce images so realistic that the discriminator is unable to classify them as fake. This is done through backpropagation, where the generator’s parameters improve with each step so that the generated image looks more realistic. Thus, the generator and discriminator play a minimax game with the following loss function of the regular GAN [1]:
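min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 − D(G(z)))]

Here D(x) is the discriminator’s estimated probability that x is a real image, and G(z) is the image produced by the generator from the noise vector z, as in the original GAN paper [1].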
Training Generative Adversarial Networks (GANs) [1] faces several challenges, including mode collapse, dynamic equilibrium, the non-convex nature of the optimization problem, and instability during training. Let’s understand each of these in detail, one by one:
Dynamic Equilibrium:
GAN training involves a dynamic equilibrium between the generator and discriminator networks. As one network improves, the other must adapt to maintain the balance. This dynamic nature makes it challenging to determine when the training process has converged, as the networks may continue to evolve and improve even after the loss values stabilize. The training process can be formulated as a minimax game where the generator aims to minimize the discriminator’s ability to distinguish between real and generated samples, while the discriminator aims to maximize its ability to differentiate between the two types of samples. During training, the generator and discriminator losses may oscillate, making it difficult to determine whether the training process has stabilized or the networks are still improving.
Non-Convex Optimization:
GANs involve a non-convex optimization problem. The objective function being optimized is non-convex due to the adversarial nature of the training process. This means that the loss landscape contains multiple local minima, making it difficult to find the global optimum. The non-convexity of the objective function can lead to convergence issues and make it challenging to train GANs effectively.
Artificial Intelligence, or AI, refers to the development of intelligent machines that can perform tasks that typically require human intelligence, such as learning, problem-solving, perception, and decision-making. AI is achieved through the development of algorithms and statistical models that enable machines to analyze and interpret large amounts of data and perform tasks that were previously only possible for humans.
Let’s understand this with the example of self-driving cars. Self-driving cars are equipped with a variety of sensors, such as radar, lidar, and cameras, that allow them to detect and respond to their surroundings. The data from these sensors is analyzed by AI algorithms, which enable the car to make real-time decisions about steering, acceleration, and braking.
For instance, if a self-driving car approaches a stop sign, the AI algorithm will recognize the sign, interpret its meaning, and instruct the car to slow down and stop at the appropriate distance from the sign. Similarly, if a pedestrian suddenly steps out into the road, the car’s AI algorithms will detect the pedestrian, recognize the potential danger, and take evasive action to avoid a collision.
The development of self-driving cars is a complex process that requires the integration of multiple AI systems, including computer vision, natural language processing, and machine learning. While there are still technical and regulatory challenges to overcome, the potential benefits of self-driving cars are significant, including reduced traffic congestion, improved safety, and increased mobility for people who cannot drive themselves.
Other examples of AI in our society include voice assistants like Siri and Alexa, image recognition technology used in security cameras, and fraud detection algorithms used by banks and credit card companies.
Deep learning (DL) is a subset of machine learning that involves training artificial neural networks with multiple layers to learn and identify patterns in data. In healthcare, deep learning has been applied to various tasks, including medical imaging analysis, electronic health record analysis, and drug discovery. Deep learning algorithms have been shown to be highly effective at detecting and diagnosing various diseases, including cancer, cardiovascular diseases, and neurological disorders, from medical images such as X-rays, CT scans, and MRIs. Deep learning algorithms can also be used to segment images and identify specific anatomical structures or lesions. This can help doctors and researchers develop more effective treatments and improve patient outcomes.
Deep learning algorithms can also be applied to electronic health records (EHRs) to identify patterns in patient data and assist with clinical decision-making. For example, deep learning algorithms can be used to predict the likelihood of readmission or identify patients at risk of developing a specific disease based on their medical history.
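As a minimal sketch of this idea (with a purely hypothetical feature matrix and readmission labels, since no dataset is given here), a simple Keras classifier over tabular EHR features might look like this:

# Sketch: binary readmission classifier on tabular EHR features.
# X and y below are random placeholders, not real patient data.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.random.rand(1000, 20)          # 1000 patients, 20 numeric features
y = np.random.randint(0, 2, 1000)     # 1 = readmitted, 0 = not readmitted

model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(20,)))
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # probability of readmission
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)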
Applications:
Deep learning algorithms are highly effective at analyzing medical images such as X-rays, CT scans, and MRIs, where they can detect and diagnose diseases and conditions including cancer, cardiovascular diseases, and neurological disorders. Here are some examples of how deep learning algorithms are used for medical image analysis:
Image segmentation: DL can be used to segment medical images, which means separating different anatomical structures or regions of interest within an image. From an MRI scan of the brain, a DL architecture such as U-Net can identify and segment different parts of the brain, such as the cortex, white matter, and cerebellum, as sketched below.
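As a rough sketch of such a segmentation network (a toy U-Net-style encoder-decoder, with an assumed 128×128 single-channel input rather than the full architecture from the U-Net paper):

# Toy U-Net-style encoder-decoder for binary segmentation (illustrative only).
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate

inputs = Input(shape=(128, 128, 1))
# Encoder: convolutions followed by downsampling
c1 = Conv2D(16, 3, activation='relu', padding='same')(inputs)
p1 = MaxPooling2D(2)(c1)
c2 = Conv2D(32, 3, activation='relu', padding='same')(p1)
p2 = MaxPooling2D(2)(c2)
# Bottleneck
b = Conv2D(64, 3, activation='relu', padding='same')(p2)
# Decoder: upsampling with skip connections back to the encoder
u1 = concatenate([UpSampling2D(2)(b), c2])
c3 = Conv2D(32, 3, activation='relu', padding='same')(u1)
u2 = concatenate([UpSampling2D(2)(c3), c1])
c4 = Conv2D(16, 3, activation='relu', padding='same')(u2)
# One sigmoid output per pixel: probability of belonging to the target structure
outputs = Conv2D(1, 1, activation='sigmoid')(c4)

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='binary_crossentropy')

For a multi-class brain segmentation (cortex, white matter, cerebellum, background), the final layer would instead use one output channel per class with a softmax activation.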
Let’s first understand what an RNN is.
Recurrent neural networks (RNN) are a class of neural networks that is powerful for modeling sequence data such as time series or natural language. Schematically, an RNN layer uses a for loop to iterate over the time steps of a sequence, while maintaining an internal state that encodes information about the time steps it has seen so far.
In a feed-forward neural network, or ANN, the information only moves in one direction: from the input layer, through the hidden layers, to the output layer. The information moves straight through the network and never revisits a node, because these networks have no memory cell to save the output computed from the inputs they have already seen. That is why ANNs are bad at predicting what comes next, especially when dealing with sequential data.
In an RNN, the information cycles through a loop. When the network makes a decision, it considers the current input as well as the previously computed output, i.e., what it has learned from the previous inputs. That is why RNNs perform better than ANNs on sequential data.
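In equation form (using the same plain notation as the CNN formulas later in this post), the hidden state h_t at time step t combines the current input with the previous state:

h_t = g(W_x · x_t + W_h · h_(t-1) + b)

where x_t is the input at step t, h_(t-1) is the state carried over from the previous step, W_x and W_h are weight matrices, and g is an activation function such as tanh.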
First, import the necessary libraries used to read the data and build the network.
# Import the libraries used to read the data and build the network
import math
import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
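With these imports in place, a minimal SimpleRNN regressor for a univariate time series can be built as below (a sketch only: the window length and layer sizes are arbitrary choices, since the dataset is not shown here).

# Sketch: small SimpleRNN regressor for a univariate time series.
# look_back (input window length) and layer sizes are illustrative.
look_back = 3
model = Sequential()
model.add(SimpleRNN(32, input_shape=(look_back, 1), activation='tanh'))
model.add(Dense(1))  # predict the next value in the series
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()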
Convolutional Neural Networks (CNNs) are a type of artificial neural network that is commonly used in image and video processing tasks. They are designed to automatically and adaptively learn spatial hierarchies of features from the images.
CNNs consist of several layers, each of which performs a specific task in the overall process. The main layers of a CNN are:
Input layer: This layer takes in the input image, which is a matrix of pixel values.
Formally, flatten the image’s matrix of pixel values into a single vector and let x be the input image; then:
x = [x1, x2, …, xn]
where n is the total number of pixels in the image.
Convolutional layer: This layer applies a set of filters (also called kernels) to the input image. Each filter extracts a specific feature from the image by sliding over the image and performing element-wise multiplication and addition operations.
Let w be the filter, b be the bias term, and f be the activation function; then the output feature map y can be calculated as follows:
y = f(w * x + b)
where * denotes the convolution operation.
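To make this concrete, here is a minimal NumPy sketch of a single 3×3 filter sliding over a small image (the image and filter values are arbitrary; like most deep learning libraries, it actually computes cross-correlation, which CNNs conventionally call convolution):

# Sketch: single-channel 2D convolution, y[i, j] = sum(w * window) + b
import numpy as np

x = np.random.rand(5, 5)           # input image, 5x5 pixels
w = np.array([[1, 0, -1],
              [1, 0, -1],
              [1, 0, -1]])         # a simple vertical-edge filter
b = 0.1                            # bias term

out_h, out_w = x.shape[0] - 2, x.shape[1] - 2   # valid (no-padding) output size
y = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        # element-wise multiplication and addition over the 3x3 window
        y[i, j] = np.sum(w * x[i:i+3, j:j+3]) + b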
After this, we apply a nonlinear activation function to the output of the convolutional layer. This helps to introduce nonlinearity into the model and allows it to capture complex patterns in the input data.
Some common activation functions include ReLU, sigmoid, and tanh. Let g be the activation function; then the output of the activation layer can be written as:
z = g(y).
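Continuing the NumPy sketch above (with a small example feature map so the snippet stands alone), applying g is just an element-wise operation:

# Apply common activation functions element-wise to a feature map y
import numpy as np

y = np.array([[-0.5, 0.2],
              [1.3, -0.1]])        # example conv-layer output
z_relu = np.maximum(0, y)          # ReLU: max(0, y)
z_sigmoid = 1 / (1 + np.exp(-y))   # sigmoid squashes values into (0, 1)
z_tanh = np.tanh(y)                # tanh squashes values into (-1, 1)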