This is the rugpull that is starting to push me to reconsider my use of Claude subscriptions. The "free ride" part of this being funded as a loss leader is coming to a close. While we break away from Claude, my hope is that I can continue to send simple problems to very smart local llms (qwen 3.6, I see you) and reserve Claude for purely extreme problems appropriate for it's extreme price.
I think an LLM that is a decent chunk smarter/better than other LLM's ought to be able to charge a premium perhaps 10x or 100x it's competitors.
See for example the price difference between taking a taxi and taking the bus, or between hiring a real lawyer Vs your friend at the bar who will give his uninformed opinion for a beer.
For now you can run /model claude-opus-4-6
Quality of answers from quantized models is noticeable worse than using the full model.
You'll be better using Qwen 3.6 Plus through Alibaba coding plan.
> This is the rugpull that is starting to push me to reconsider my use of Claude subscriptions.
I'm still with them cause the model is good, but yes, I'm noticing my limits burning up somewhat faster on the 100 USD tier, I bet the 20 USD tier is even more useless.
I wouldn't call it a rugpull, since it seems like there might be good technical reasons for the change, but at the same time we won't know for sure if they won't COMMUNICATE that to us. I feel like what's missing is a technical blog post that tells uz more about the change and the tokenizer, although I fear that this won't be done due to wanting to keep "trade secrets" or whatever (the unfortunate consequence of which is making the community feel like they're being rugpulled).