The biggest challenge I have with local models right now (and I use them extensively) is search integration and tool calling. What Claude and ChatGPT get right for most general-purpose use cases, and what's hard to replicate locally, is threefold: the model deciding when to search versus answering from its own training, strong search tooling behind that decision, and tool calling for additional data sources via MCP. If you can get the right data into the context window, local models are more than good enough for general-purpose usage as they stand today. Qwen 3.5, Gemma 4, even gpt-oss-120b are solid at reasonable quants if they have the right data.
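For concreteness, the loop the hosted products hide is small; the hard part is everything behind it. Here's a minimal sketch of the "decide when to search" loop, assuming LM Studio's OpenAI-compatible server on localhost:1234; the model name and the web_search() stub are placeholders, not a real search backend:

```python
import json
from openai import OpenAI

# LM Studio exposes an OpenAI-compatible API locally; the key is ignored.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

TOOLS = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def web_search(query: str) -> str:
    # Placeholder: wire this up to a real backend (SearXNG, Brave, etc.).
    return f"(no results for {query!r} -- stub)"

def ask(question: str, model: str = "local-model") -> str:
    messages = [{"role": "user", "content": question}]
    while True:
        resp = client.chat.completions.create(
            model=model, messages=messages, tools=TOOLS
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:
            # Model chose to answer from its built-in training.
            return msg.content
        # Model chose to search: run each call, feed results back, loop.
        messages.append(msg)
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": web_search(**args),
            })
```

The loop itself is trivial; whether it works well comes down entirely to how reliably the local model decides when to emit a tool call and how good the search results you feed back are.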
The moment we see standardized, batteries-included pathways to integrate search (ideally at no additional cost) in things like LM Studio, combined with better tool calling in the local models themselves, you'll quickly see local model performance catch up.