> And to add insult to injury, some providers will ride on the good reputation of some local model, selling you a terrible quant instead.
I just started using OpenRouter for some control testing of local models and what surprises me the most isn't that there are different providers providing different quantization levels, that makes sense, but I can't seemingly find a way of seeing what provider+model+quantization is actually used?! https://openrouter.ai/models shows the models, then say https://openrouter.ai/moonshotai/kimi-k2.7-code shows the providers but when I go to https://openrouter.ai/moonshotai/kimi-k2.7-code?endpoint=e7a... for example, why on earth is it not showing the actual details about the actual weights they're serving?! Give me details! It does have a "Precision" value that is sometimes filled out, but that seems to be a guess at best, even providers with the same values there have wildly different quality responses.
I like the idea about OpenRouter but holy hell does the implementation seem very far off from what it needs to be, in order to be useful.
There are properties on the API call you can pass for specific providers, so you test which providers you like the output, then add them to the list in ranked order if you want one by default, then to fall back to the other.
There might be something in the response, or in a followup API call for the session, that you get better details. I think I've seen the details in the dashboard, so they do exist.