If we apply log and softmax separately, then when the output of softmax gets very close to zero, log yields negative infinity:
import torch

x = torch.tensor([-500.0, 0.0])
torch.log(torch.softmax(x, dim=0))  # tensor([-inf, 0.])
torch.log_softmax(x, dim=0)         # tensor([-500., 0.])
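The fused version can stay stable by using the standard log-sum-exp (max-shift) trick: subtract the maximum before exponentiating, since log_softmax(x) = x - logsumexp(x) = (x - max(x)) - log(sum(exp(x - max(x)))). Here is a minimal sketch of that idea; the helper name stable_log_softmax is ours, not a PyTorch API:

import torch

def stable_log_softmax(x, dim=0):
    # Shift by the max so the largest exponent is exp(0) = 1; any
    # underflow in the remaining terms is harmless because the sum
    # is dominated by that leading 1.
    shifted = x - x.max(dim=dim, keepdim=True).values
    # log_softmax(x) = x - logsumexp(x) = shifted - log(sum(exp(shifted)))
    return shifted - shifted.exp().sum(dim=dim, keepdim=True).log()

x = torch.tensor([-500.0, 0.0])
stable_log_softmax(x)  # tensor([-500., 0.]), matches torch.log_softmax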
The numerical instability stems from applying log and exp as separate operations:
torch.log(torch.exp(x)) # tensor([-inf, 0.])
In the above example, we expect log and exp to cancel each other out and return x, but exp(-500.) underflows to zero in float32, so we actually get [-inf, 0.].
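The same pattern shows up with log(sum(exp(x))): computed as separate ops it overflows for large inputs, while the fused torch.logsumexp stays finite. The input values below are ours, chosen to trigger overflow:

import torch

x = torch.tensor([1000.0, 0.0])
torch.log(torch.exp(x).sum())  # tensor(inf): exp(1000.) overflows
torch.logsumexp(x, dim=0)      # tensor(1000.), computed stably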