This is why, despite enjoying all of this, I really want to focus on locally hosted models. If we don't host the technology ourselves, we're setting ourselves up for a hard fall down the line.
Until very recently, local models have been little more than brittle toys in my experience, at least if you're trying to use them for coding.
But lately I've been running Pi (a minimal coding agent harness) with Gemma4 and Qwen3.6, and I've been blown away by how capable and fast they are compared to other models of their size. (I'm using the biggest variants that fit into 24 GB, not the smaller ones.) In fact, I don't really need to reach for Claude and friends much of the time (for my use cases, at least). If you want to try something similar, a sketch of the basic wiring follows.
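If you haven't wired up a local model before, here's roughly what that looks like. This is a minimal sketch, not Pi's actual configuration: it assumes your local model is served behind an OpenAI-compatible endpoint (llama.cpp's llama-server, Ollama, and LM Studio all expose one), and the base URL, port, and model name are placeholders for whatever you run.

```python
# Minimal sketch: talking to a locally served model through an
# OpenAI-compatible API. Endpoint and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical local endpoint
    api_key="unused",  # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder; use the name your server reports
    messages=[
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a function that reverses a string."},
    ],
)
print(response.choices[0].message.content)
```

The nice part of this arrangement is that the client code doesn't care whether the endpoint is a local GPU or a cloud API, so swapping models (or falling back to Claude and friends) is a one-line change.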