Topics
In 1-bit LLMs, weights are represented in binary (or in ternary, in the case of BitNet b1.58). This greatly simplifies matrix multiplication.
In the above example, with the weights restricted to the ternary values {-1, 0, 1}, we got rid of multiplication: each weight either adds the corresponding input, subtracts it, or skips it. Special hardware can be designed for this operation, which could make processing very fast.
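The same idea can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not BitNet's actual kernel; the function name `ternary_matvec` and the sample values are hypothetical.

```python
def ternary_matvec(weights, x):
    """Compute y = W x for a matrix W with entries in {-1, 0, 1}.

    Only additions and subtractions are used; no multiplication is needed.
    """
    y = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # weight +1: add the input
            elif w == -1:
                acc -= xi      # weight -1: subtract the input
            elif w != 0:
                raise ValueError(f"weight {w!r} is not ternary")
            # weight 0: skip the input entirely
        y.append(acc)
    return y


# Hypothetical sample values, chosen only for illustration.
W = [[1, 0, -1],
     [-1, 1, 0]]
x = [2.0, 3.0, 5.0]
y = ternary_matvec(W, x)
print(y)  # [-3.0, 1.0]
```

Here the first output is 2.0 - 5.0 and the second is -2.0 + 3.0, so the inner loop reduces to conditional accumulation. Dedicated hardware could presumably exploit this by replacing multipliers with adders.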