> You probably want to try renting some time on a dedicated box with roughly the specs you’re considering and running the open models
You don't even need to go that far. For example, with Exoscale Dedicated Inference[1] you just point it at the Hugging Face for the model and quantisation you want to test and it automagically spits out an OpenAI-compatible API endpoint.
[1] https://www.exoscale.com/ai-cloud-infrastructure/dedicated-i...
(I have no relationship with Exoscale, this particular product just crossed my radar recently)
I think they're just suggesting renting as a way to test that the hardware they're considering purchasing would actually be able to do what they need.