Altamash Khan

Did the war forge the spear that remained? No. All it did was identify the spear that wouldn't break

No non-linearity is used in word2vec

Oct 22, 2024, 1 min read

Topics

word2vec

This is mainly because the network is shallow: it has only embedding (lookup) or linear layers, so there is no stack of layers that an activation function would be needed to separate. The non-linearity enters through the loss function instead, which applies a non-linear function to the logits (e.g., sigmoid for skip-gram with negative sampling, softmax for cross-entropy).
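To make this concrete, here is a minimal NumPy sketch of the skip-gram negative-sampling loss for a single training pair. It is an illustration under assumptions, not the reference word2vec code: the names `W_in`, `W_out`, `sgns_loss`, the tiny vocabulary, and the hand-picked negative indices are all hypothetical. The point is that the logits are plain dot products of embeddings, and the only non-linear function (the sigmoid) appears inside the loss.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_loss(center_vec, context_vec, negative_vecs):
    """Skip-gram negative-sampling loss for one (center, context) pair."""
    # Logits are plain dot products: no activation between the two embeddings.
    pos_logit = center_vec @ context_vec      # scalar
    neg_logits = negative_vecs @ center_vec   # one logit per negative sample
    # The sigmoid lives in the loss, not in the network.
    return -np.log(sigmoid(pos_logit)) - np.sum(np.log(sigmoid(-neg_logits)))

# Hypothetical toy setup: 10-word vocabulary, 4-dimensional embeddings.
rng = np.random.default_rng(0)
vocab_size, dim = 10, 4
W_in = rng.normal(scale=0.1, size=(vocab_size, dim))   # input (center) embeddings
W_out = rng.normal(scale=0.1, size=(vocab_size, dim))  # output (context) embeddings

center, context, negatives = 2, 5, [1, 7, 8]  # word indices chosen by hand
loss = sgns_loss(W_in[center], W_out[context], W_out[negatives])
print(loss)
```

In a real setup the negatives would be sampled from a noise distribution and the embeddings updated by gradient descent; both are left out here to keep the focus on where the sigmoid sits.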