Every time I've tried a local model, and I have tried lots over a couple of years now, it just seems like it was overtrained on benchmarks. They consistently perform dramatically worse than even older models from Anthropic/OAI/Google.
You're just using them wrong.