If we have output predictions (logits) from a model for a classification task with C classes, we can evaluate them against the target class id using negative log likelihood (NLL) loss.
import torch
import torch.nn.functional as F

logits = torch.tensor([0.2, 2.0, 1.4])  # raw model outputs for C = 3 classes
target_class_id = torch.tensor(1)       # the correct class index
Here we have 3 classes, so 3 logit values, and our target is class 1 (classes are 0-indexed). The logit at index 1 is 2.0.
Caveat
The standard nn.NLLLoss() in PyTorch expects integer class targets, not continuous probabilities. It is designed for hard classification where each sample belongs to exactly one class.
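To make the expected shapes concrete, here is a minimal sketch of the usual batched call; the (N, C) layout is standard, and the values simply reuse the logits from this example:

import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.NLLLoss()
log_probs = F.log_softmax(torch.tensor([[0.2, 2.0, 1.4]]), dim=1)  # shape (N=1, C=3)
targets = torch.tensor([1])  # one integer class id per sample, shape (N,)
print(loss_fn(log_probs, targets))  # tensor(0.5389)

Note that nn.NLLLoss expects log probabilities as input, which is why log_softmax is applied first.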
Step 1: Convert logits to probabilities
probs = F.softmax(logits, dim=0)  # tensor([0.0964, 0.5834, 0.3202])
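Softmax just exponentiates each logit and normalizes by the sum, so we can sanity-check the result by hand:

manual_probs = torch.exp(logits) / torch.exp(logits).sum()
print(torch.allclose(probs, manual_probs))  # True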
We see that our target probability is 0.5834. Ideally we want the probability for the target class to be 1.0, so we need to measure how bad this 0.5834 value is. Taking the log of this probability gives us the log likelihood.
Step 2: Getting the log likelihoods
ll = torch.log(probs)  # tensor([-2.3389, -0.5389, -1.1389])
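In practice, the softmax and log steps are fused into a single call, F.log_softmax, which is numerically more stable than computing them separately; a quick check of the equivalence:

ll_fused = F.log_softmax(logits, dim=0)
print(torch.allclose(ll, ll_fused))  # True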
Step 3: Calculate the loss
The loss is simply the negative of the log likelihood at the target_class_id position: the value at index 1 is -0.5389, so negating it gives 0.5389. This is what F.nll_loss(...) does, hence the name "negative log likelihood" loss.
# gather requires the index tensor to have the same number of dims as the
# input, so we unsqueeze the 0-d target to shape (1,) and squeeze the result
val_1 = -1.0 * ll.gather(dim=0, index=target_class_id.unsqueeze(0)).squeeze()
val_2 = F.nll_loss(ll, target_class_id)
print(val_1, val_2)  # tensor(0.5389) tensor(0.5389)
torch.isclose(val_1, val_2)  # tensor(True)
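For completeness: F.cross_entropy fuses all three steps (softmax, log, and the negative-likelihood lookup) and takes the raw logits directly, which is what you would typically call in training code. The unbatched (C,)-shaped call below assumes a reasonably recent PyTorch version that accepts inputs without a batch dimension:

val_3 = F.cross_entropy(logits, target_class_id)
print(val_3)  # tensor(0.5389)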