What size model are you quantizing and comparing? The interesting thing about quantization is that the more parameters a model has, the less difference quantizing the weights makes, even to an extreme degree with the largest models. For small models it can be a disaster, though.
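
To make "quantizing to an extreme degree" concrete, here is a rough sketch of plain symmetric round-to-nearest quantization on a made-up list of weights. The function names and the Gaussian toy weights are my own assumptions, not from any particular library. Note that it only shows how the per-weight rounding error grows as you drop bits; it does not show the model-size effect, which you would have to measure by evaluating real models of different sizes.

```python
import random

def quantize_dequantize(weights, bits):
    """Symmetric round-to-nearest quantization to `bits` bits, then back to floats.

    Hypothetical helper for illustration; real quantizers usually work per
    block or per channel rather than over one flat list.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit, 1 for 2-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    out = []
    for w in weights:
        q = max(-qmax, min(qmax, round(w / scale)))  # integer code
        out.append(q * scale)                        # reconstructed weight
    return out

def rms_error(a, b):
    """Root-mean-square difference between two equal-length lists."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

# Toy "weights": an assumption, real layers are not exactly Gaussian.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

errors = {
    bits: rms_error(weights, quantize_dequantize(weights, bits))
    for bits in (8, 4, 2)
}
for bits, err in errors.items():
    print(f"{bits}-bit RMS error: {err:.6f}")
```

The error per weight goes up sharply at 4 and 2 bits. The claim above is about how much of that error a model can absorb, which is why the comparison only means something once you say what size model you ran it on.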