As our first introduction to AI, we demonstrate the process of building and training a simple neural network in PyTorch.
Neural Network, a type of machine learning model: inspired by the structure and function of biological neural networks in the human brain. They consist of interconnected nodes, or artificial neurons, organized into layers. These layers process information in a way similar to how biological neurons process signals.
Deep Learning Framework
import torch
import torch.nn as nn
import torch.nn.functional as F
Here we import the PyTorch packages and give them shorter aliases.
torch.nn Module: provides building blocks for neural networks, such as layers (nn.Linear), activation functions (nn.ReLU), and loss functions (nn.MSELoss).
torch.nn.functional Module: offers functional versions of the same operations (F.relu) for more flexibility.
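To make the difference concrete, here is a minimal sketch contrasting the two APIs (the tensor values are made up for illustration):

x = torch.tensor([-1.0, 0.0, 2.0])

relu_layer = nn.ReLU()    # module-style: an object you can place inside nn.Sequential
print(relu_layer(x))      # tensor([0., 0., 2.])

print(F.relu(x))          # functional-style: same result, no object needed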
GPU Acceleration: utilizes GPUs to accelerate computations, making training and inference faster.
Automatic Differentiation: provides automatic differentiation, which simplifies computing gradients for backpropagation (see the short demo after this list).
Dynamic Computation Graph: allows for more flexibility during model development and debugging. It constructs the graph as you execute operations, making it easier to modify the model on-the-fly.
Optimized Libraries: includes optimized libraries for operations like linear algebra and tensor computations, contributing to overall performance.
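A tiny autograd demo; the function y = x^2 + 3x is chosen purely for illustration:

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x     # the computation graph is built dynamically as we go
y.backward()           # automatic differentiation: computes dy/dx
print(x.grad)          # tensor(7.) since dy/dx = 2x + 3 = 7 at x = 2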
Input -> Neuron -> Output
A Replica of the Human Brain: by understanding how human brains function and loosely replicating them, we can create the artificial neurons that power AI.
Simplest neural network: consisting of one input layer and a single neuron, this network is called a perceptron.
Each input is multiplied by a weight (initialized with a random value) and the results are added together. The sum is then passed through an activation function, which plays a role analogous to the nucleus of a biological neuron.
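A hand-rolled sketch of that weighted-sum-plus-activation idea (the input values and the choice of sigmoid are illustrative assumptions):

inputs = torch.tensor([0.5, -1.0, 2.0])
weights = torch.rand(3)                       # weights start out random
bias = torch.rand(1)
weighted_sum = (inputs * weights).sum() + bias
output = torch.sigmoid(weighted_sum)          # activation function squashes the sum
print(output)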
Constructed from 3 Layers
Input layer -
Initial data for the neural network.
Hidden layers -
An intermediate layer where all the computation happens, transforming the inputs so the network can produce more accurate results.
Uses nn.Linear to create linear hidden layers.
Output layer -
Produces the result for the given inputs.
Uses nn.Sequential to stack layers together.
Real-life Use: The neural network above predicts the probability that a person survives. Though the above seems simple, real-life neural networks can have billions of parameters spread across hundreds of hidden layers. Neural networks with 2 or more hidden layers are called Deep Neural Networks, which is why this field is named Deep Learning.
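As an illustration only (the feature count and layer sizes here are assumptions, not the lesson's exact model), a survival predictor might look like:

survival_model = nn.Sequential(
    nn.Linear(4, 16),    # 4 input features, e.g. age, sex, class, fare
    nn.ReLU(),
    nn.Linear(16, 1),
    nn.Sigmoid(),        # squash the output to a probability in [0, 1]
)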
A linear model is a function with a constant rate of change; for example, the equation y = x. Linear models are simple and easy to interpret, but they may not capture the complexity and non-linearity of some phenomena.
model = nn.Sequential(
    nn.Linear(1, 128),
    nn.Linear(128, 1)
)
Our linear model: This model has 2 linear layers; the first takes 1 input and gives a 128-dimensional output, while the second takes that 128-dimensional output and gives 1 output. Note that stacking linear layers without an activation in between still yields a linear function overall, so this model can only fit straight lines.
A comparison of our networks: comparing models is crucial for understanding how a network learns and how well it approximates the target function.
Non-linear models are functions with a variable rate of change: the slope of the curve is not constant but depends on the value of x. They are more flexible and can fit more diverse and complex data, but they may also be harder to analyze and understand.
model2 = nn.Sequential(
    nn.Linear(1, 128),
    nn.ReLU(),
    nn.Linear(128, 1)
)
Our Non-linear model: In between our 2 linear layers, this model adds the activation function ReLU (ReLU(x) = max(0, x)), making it non-linear and able to fit curved functions.
class NonLinearNet2(nn.Module):
    def __init__(self):
        super(NonLinearNet2, self).__init__()
        self.fc1 = nn.Linear(1, 168)
        self.fc2 = nn.Linear(168, 168)
        self.fc3 = nn.Linear(168, 64)
        self.fc4 = nn.Linear(64, 1)

    def forward(self, x):
        # apply each layer, with ReLU activations in between
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        return self.fc4(x)
Forward method in NonLinearNet2: defines how data flows through the network, applying layers and activation functions sequentially (shown above). The extra depth and width let the network fit non-linear functions like sin(x), cos(x), and sin(x^2).
y = torch.cos(X ** 2) + torch.sin(X ** 2)
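The input tensor X isn't defined in the snippet above; one plausible way to generate it (an assumption, not the lesson's original code) is:

X = torch.linspace(-3, 3, 500).unsqueeze(1)   # 500 points shaped (500, 1) to match nn.Linear(1, ...)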
opt2 = torch.optim.Adam(model2.parameters(), lr=0.01)
losses2 = train(model2, crit2, opt2, X, y, num_epochs=1000)
opt2 = torch.optim.Adam(model2.parameters(), lr=0.001)
losses2 = train(model2, crit2, opt2, X, y, num_epochs=10000)
Optimizer torch.optim.Adam: employed to adjust model parameters during training, minimizing the loss function. By decreasing the learning rate (0.01 -> 0.001) and increasing the number of training epochs (1000 -> 10000), we can make our model predict better.
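The train helper and crit2 are used above but not shown; a minimal sketch of what they might look like (assuming crit2 is a loss such as nn.MSELoss()):

crit2 = nn.MSELoss()

def train(model, criterion, optimizer, X, y, num_epochs):
    losses = []
    for epoch in range(num_epochs):
        optimizer.zero_grad()         # clear gradients from the previous step
        y_pred = model(X)             # forward pass
        loss = criterion(y_pred, y)   # measure prediction error
        loss.backward()               # backpropagation via autograd
        optimizer.step()              # update the weights
        losses.append(loss.item())
    return losses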
Over a year of experience in Python: even with a year of Python behind me, I still find neural networks a difficult topic to follow. In this lesson I couldn't grasp the deep, intricate inner workings of AI, but I did grasp its fundamentals.