Topics

In 1-bit LLMs, each weight is restricted to a binary value (two values, such as -1 and 1) or, in the case of BitNet b1.58, a ternary value (one of -1, 0, or 1). This makes matrix multiplication very simple.

W = [1, -1, 0, 1]
X = [a, b, c, d]

W @ X (dot product) = 1*a + (-1)*b + 0*c + 1*d = a - b + d

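In the above example, with the ternary weights [1, -1, 0, 1], the dot product reduces to additions and subtractions: a weight of 1 adds the input, -1 subtracts it, and 0 skips it, so no multiplication is needed. Special hardware can be designed for this operation, which could make processing very fast.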
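As a rough illustration of the idea, here is a minimal Python sketch of a multiplication-free dot product over ternary weights. The function names `ternary_dot` and `ternary_matvec` and the numeric inputs are made up for this example; this is not BitNet's actual kernel, only a sketch of the arithmetic it relies on.

```python
def ternary_dot(weights, activations):
    """Dot product with ternary weights, using only addition and subtraction.

    Hypothetical helper for illustration; assumes every weight is -1, 0, or 1.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    total = 0
    for w, x in zip(weights, activations):
        if w == 1:
            total += x      # weight 1: add the input
        elif w == -1:
            total -= x      # weight -1: subtract the input
        elif w != 0:
            raise ValueError(f"weight {w} is not ternary")
        # weight 0: the input is skipped entirely
    return total


def ternary_matvec(weight_rows, activations):
    """Matrix-vector product: one ternary dot product per row of weights."""
    return [ternary_dot(row, activations) for row in weight_rows]


W = [1, -1, 0, 1]
X = [2.0, 5.0, 7.0, 4.0]    # example values standing in for a, b, c, d
print(ternary_dot(W, X))    # a - b + d = 2.0 - 5.0 + 4.0 = 1.0
```

A full matrix multiplication is just this dot product repeated for every row of the weight matrix, which is what `ternary_matvec` does. The branching per weight here is for readability only; a real implementation would pack the weights and process many at once.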