Topics
To constrain the weights to -1, 0, or +1, in BitNet b1.58 we adopt an absmean quantization
function. It first scales the weight matrix by its average absolute value, and then round each value to the nearest integer among {-1, 0, +1}:
Did the war forge the spear that remained? No. All it did was identify the spear that wouldn't break
Topics
To constrain the weights to -1, 0, or +1, in BitNet b1.58 we adopt an absmean quantization
function. It first scales the weight matrix by its average absolute value, and then round each value to the nearest integer among {-1, 0, +1}: