I may be wrong, but this is what I figured out. Google released these models as quantization-ready, but not pre-quantized. To produce its own benchmarks, Google quantized the models with a standard quantization approach. Unsloth uses a more advanced quantization method that performs better than the standard one, so the Unsloth quants score better on the evals.
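
To illustrate why the quantization method alone can change the numbers, here is a toy sketch in Python. It is not Google's or Unsloth's actual method (I have not verified either). The function names, the 4-bit width, and the block size of 32 are all assumptions I made up for the example. It quantizes the same weights twice, once with a single scale for the whole tensor and once with a separate scale per block, and compares the reconstruction error.

```python
import random

def quantize_rtn(weights, bits=4):
    """Symmetric round-to-nearest with one shared scale (illustrative only)."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest > 0 else 1.0
    # Quantize to an integer level, clamp, then map back to a float.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def quantize_blockwise(weights, bits=4, block=32):
    """Same rounding, but each block of `block` weights gets its own scale."""
    out = []
    for start in range(0, len(weights), block):
        out.extend(quantize_rtn(weights[start:start + block], bits))
    return out

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
# Hypothetical weights: small values plus a single large outlier.
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]
weights[5] = 1.0

err_tensor = mse(weights, quantize_rtn(weights))
err_block = mse(weights, quantize_blockwise(weights))
print(f"per-tensor MSE: {err_tensor:.2e}")
print(f"per-block  MSE: {err_block:.2e}")
```

With one shared scale, the outlier stretches the range so much that nearly every small weight rounds to zero. With per-block scales, only the block containing the outlier pays that cost. Same model, same bit width, different error, which is the kind of gap that can show up in evals.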