negative log loss and cross entropy in pytorch
NLLLoss
negative log likelihood loss
Say the output of a neural network is a C-sized vector, representing C classes.
First we need to turn it into a probability distribution over the C classes using softmax,
i.e. softmax(x)_i = exp(x_i) / sum_j exp(x_j), so each of the C elements gets a score between 0 and 1, and all scores add up to 1.
These scores can be very small, so they can underflow a float variable, especially when you multiply many scores together.
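A quick sketch of the underflow problem, in plain Python with made-up illustrative numbers: multiplying small probabilities hits zero, while summing their logs keeps the information.

```python
import math

# multiplying many small probabilities underflows to 0.0
p = 1e-40
product = 1.0
for _ in range(10):
    product *= p
print(product)  # 0.0 -- the true value 1e-400 underflows a double

# summing logs instead keeps the information
log_sum = 10 * math.log(p)
print(log_sum)  # about -921.03
```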
So we apply log to the scores. In PyTorch, the LogSoftmax module does both softmax and log in one numerically stable step.
The output of the neural network can then be transformed into a vector of log probabilities of size C.
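A minimal sketch of this transformation, using a toy logit vector I made up for illustration:

```python
import torch
import torch.nn.functional as F

# a toy batch of one prediction over C = 3 classes (raw logits)
logits = torch.tensor([[2.0, 1.0, 0.1]])

# log_softmax = log(softmax(x)), computed in a numerically stable way
log_probs = F.log_softmax(logits, dim=1)

# exponentiating recovers the softmax probabilities, which sum to 1
probs = log_probs.exp()
print(probs.sum().item())  # approximately 1.0
```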
Say the correct class is #i, so we hope the model gives a large log probability to the i-th element.
Denote the i-th element of the output vector (the prediction) as xi.
The bigger xi is, the better.
So we define the loss as negative xi (the smaller the loss, the better):
for one prediction, loss = -xi
for one batch, loss = sum(-xi) or loss = mean(-xi) over all predictions.
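A small sketch of this with hand-picked log probabilities, checking that NLLLoss really is just the mean of -xi picked at the target indices:

```python
import torch
import torch.nn.functional as F

# toy log probabilities for a batch of 2 predictions over 3 classes
log_probs = torch.tensor([[-0.1, -2.3, -4.0],
                          [-1.2, -0.4, -3.1]])
targets = torch.tensor([0, 1])  # correct class index per row

# nll_loss picks -log_probs[row, target] and averages by default
loss = F.nll_loss(log_probs, targets)

# manual version: mean of -xi over the batch
manual = -(log_probs[0, 0] + log_probs[1, 1]) / 2
print(loss.item(), manual.item())  # both 0.25
```

Pass `reduction="sum"` instead of the default `"mean"` to get the sum(-xi) variant.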
CrossEntropyLoss combines LogSoftmax and NLLLoss in one single class.
So if you prefer not to have a LogSoftmax layer within your model, use CrossEntropyLoss instead.
The two approaches are mathematically equivalent.
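A sketch verifying the equivalence on random logits (random data here is purely for illustration):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 5)           # batch of 4, C = 5 classes
targets = torch.randint(0, 5, (4,))  # random correct-class indices

# option 1: LogSoftmax inside the model, NLLLoss outside
log_probs = F.log_softmax(logits, dim=1)
loss_a = F.nll_loss(log_probs, targets)

# option 2: raw logits straight into cross_entropy
loss_b = F.cross_entropy(logits, targets)

print(torch.allclose(loss_a, loss_b))  # True
```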