Inspired by the architecture of the brain, neural networks are composed of neurons and connections.
We feed the input layer with an image and get a prediction at the output layer.
The layers are connected by floating-point values, called weights, which represent the importance of each connection.
Knowing the expected output allows us to correct the weights in the network, using the chain rule (backpropagation). This method is based on gradient descent.
Thanks to these corrections, the network makes fewer errors as training progresses.
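The correction step can be sketched in a few lines of plain Python. This is a minimal, hypothetical example (a single weight fitted to one input/target pair, with made-up values for `x`, `target` and the learning rate), not the actual U-Net training loop:

```python
# Minimal gradient-descent sketch: fit one weight w so that the
# prediction w * x matches a target output. All values are illustrative.

def train_single_weight(x, target, lr=0.1, steps=50):
    """Repeatedly correct w using the gradient of the squared error."""
    w = 0.0  # initial weight
    for _ in range(steps):
        prediction = w * x
        error = prediction - target
        # derivative of error**2 with respect to w: 2 * error * x
        gradient = 2 * error * x
        w -= lr * gradient  # the gradient-descent update
    return w

w = train_single_weight(x=2.0, target=6.0)
print(round(w, 3))  # converges towards 3.0, since 3.0 * 2.0 == 6.0
```

In a real network the same update rule is applied to millions of weights at once, with gradients computed by backpropagation.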
For this project, we chose the U-Net convolutional network, described in the Useful Links section. This network uses convolutions to extract relevant patterns from the picture, so that it can distinguish between different shapes and recognise structures.
Convolution
Transposed convolution
Blue maps are the input, green maps are the output
Source: https://github.com/vdumoulin/conv_arithmetic

A kernel matrix, which is essentially a filter, slides over the image and computes a weighted sum of the pixel values to produce an output. As training progresses, these filters adapt to extract patterns from the image, such as high-gradient areas or texture changes.
The values inside these kernels are learned by the network and improved during training.
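The sliding-window computation can be sketched in pure Python. The function below is a hand-rolled "valid" convolution (strictly speaking a cross-correlation, since deep-learning frameworks do not flip the kernel); the Sobel-style kernel is one classic example of a filter that responds to high horizontal gradients, i.e. vertical edges:

```python
# Hand-rolled 2D convolution sketch (no deep-learning framework).
# A 3x3 kernel slides over the image; each output pixel is the weighted
# sum of the 3x3 neighbourhood under the kernel.

def convolve2d(image, kernel):
    """'Valid' convolution: the output shrinks by kernel_size - 1 per axis."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output

# A vertical step edge: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1]] * 4
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
print(convolve2d(image, sobel_x))  # strong response along the edge
```

In a trained network the kernel values are not fixed like this Sobel filter: they start random and are adjusted by gradient descent.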
The activation function introduces non-linearities into the network, enlarging the space of functions it can represent. In the U-Net implementation, the ReLU function is used.
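ReLU is simply ReLU(x) = max(0, x), applied element-wise: negative values are zeroed and positive values pass through unchanged. A one-line sketch:

```python
# ReLU activation: zero out negative values, keep positive ones.
def relu(values):
    return [max(0.0, v) for v in values]

print(relu([-2.0, -0.5, 0.0, 1.5, 3.0]))  # [0.0, 0.0, 0.0, 1.5, 3.0]
```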
To reduce the size of an image, we take the maximum value over each small window of pixels (2x2 in the original U-Net) to produce one pixel in the output.
Max Pool 2D
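Max pooling can be sketched as follows. This is a minimal example assuming non-overlapping windows with a stride equal to the window size (here 2x2, as commonly used in U-Net implementations); the input values are arbitrary:

```python
# Max-pooling sketch: each window x window block collapses to its maximum,
# halving (for window=2) the height and width of the feature map.

def max_pool_2d(image, window=2):
    """Non-overlapping max pooling with stride equal to the window size."""
    out = []
    for i in range(0, len(image) - window + 1, window):
        row = []
        for j in range(0, len(image[0]) - window + 1, window):
            block = [image[i + di][j + dj]
                     for di in range(window) for dj in range(window)]
            row.append(max(block))
        out.append(row)
    return out

image = [[1, 3, 2, 0],
         [4, 2, 1, 5],
         [0, 1, 8, 6],
         [2, 3, 7, 4]]
print(max_pool_2d(image))  # [[4, 5], [3, 8]]
```

Because only the maximum survives, pooling keeps the strongest filter responses while discarding their exact position, which makes the following layers see a coarser, more abstract view of the image.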