
In PyTorch, the CrossEntropyLoss-with-logits technique (logits just means raw output values with no activation applied) is really just wrapper code around the older NLLLoss-with-LogSoftmax technique. With the newer, simpler approach to multi-class classification, you apply no activation to the output layer and CrossEntropyLoss applies LogSoftmax internally. With the older approach, you apply LogSoftmax to the output yourself, and NLLLoss assumes you have done so.
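A minimal sketch of this equivalence, using a small random batch of logits (the tensor shapes and seed here are just illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 3)            # raw scores: 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 2])  # true class index for each sample

# Newer approach: CrossEntropyLoss consumes raw logits directly
# and applies LogSoftmax internally.
ce_loss = nn.CrossEntropyLoss()(logits, targets)

# Older approach: apply LogSoftmax yourself, then NLLLoss,
# which assumes log-probabilities as input.
log_probs = nn.LogSoftmax(dim=1)(logits)
nll_loss = nn.NLLLoss()(log_probs, targets)

print(torch.allclose(ce_loss, nll_loss))  # True: the two losses match
```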

During prediction with the CrossEntropyLoss technique, the raw output values are logits, so if you want to view probabilities you must apply softmax. With the older NLLLoss technique, the raw output values are log-probabilities (the result of LogSoftmax), so if you want to view probabilities you must apply the exp() function to undo the log.
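A quick sketch of both prediction-time conversions; the one-sample tensor here stands in for a trained model's output:

```python
import torch

torch.manual_seed(1)
logits = torch.randn(1, 3)  # raw model output for one sample

# CrossEntropyLoss-style model: output is logits,
# so apply softmax to view probabilities.
probs_from_logits = torch.softmax(logits, dim=1)

# NLLLoss-style model: output is log-probabilities,
# so apply exp() to undo the log.
log_probs = torch.log_softmax(logits, dim=1)
probs_from_log = torch.exp(log_probs)

print(torch.allclose(probs_from_logits, probs_from_log))  # True
print(probs_from_logits.sum())  # probabilities sum to 1
```

Either way, the predicted class itself is just the argmax, which is the same whether you take it over logits, log-probabilities, or probabilities.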