Do you have any reason to believe that Granite is more immune to the effects of quantization than other tiny models? Otherwise it seems odd to judge a tiny model's true capabilities by its 4-bit quant.
This model is small enough that it might be sensible to try the same prompts against all of the quant sizes to try and spot any differences.
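Something like this rough sketch is what I have in mind for the comparison. The `generate` callable is a hypothetical stand-in for whatever runner you use to load a given quant and answer a prompt, and the quant labels are placeholders rather than real model tags. Exact-match comparison only tells you much if sampling is deterministic (e.g. temperature 0); otherwise you would still need to read the outputs side by side.

```python
from typing import Callable, Dict, List


def compare_quants(
    prompts: List[str],
    quants: List[str],
    generate: Callable[[str, str], str],
) -> Dict[str, Dict[str, str]]:
    """Run every prompt against every quant.

    `generate(quant, prompt)` is a hypothetical hook: plug in your own
    runner here. Returns {prompt: {quant: output}}.
    """
    results: Dict[str, Dict[str, str]] = {}
    for prompt in prompts:
        results[prompt] = {q: generate(q, prompt) for q in quants}
    return results


def differing_prompts(results: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the prompts whose outputs are not identical across all quants."""
    return [p for p, outs in results.items() if len(set(outs.values())) > 1]


if __name__ == "__main__":
    # Stand-in runner for demonstration only: pretends the smallest quant
    # gets one question wrong. Replace with a real call to your model runner.
    def fake_generate(quant: str, prompt: str) -> str:
        if quant == "q4" and prompt == "What is 17 * 3?":
            return "52"
        return {"What is 17 * 3?": "51"}.get(prompt, "Paris")

    prompts = ["What is 17 * 3?", "What is the capital of France?"]
    quants = ["q4", "q8", "f16"]  # placeholder labels, not real tags

    results = compare_quants(prompts, quants, fake_generate)
    for prompt in differing_prompts(results):
        print(prompt, results[prompt])
```

Any prompt this flags is a candidate for a quantization effect, though a handful of prompts is obviously not a benchmark.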