> You have fallen headfirst into the "Not now, so never" fallacy.
Perhaps. Though we have empirical evidence of how far we can quantize and distill models before they become practically useless. That sets a bar for how large a local model needs to be to compete with the cloud ones for general use. We are talking in the area of 60GB for GPT-OSS/Qwen3.5, which is what enthusiasts are running on 32GB of DDR5 plus a 24GB-VRAM RTX 3090.
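For concreteness, here's the back-of-envelope budget (a quick sketch; the offload split is my assumption about a typical llama.cpp-style setup, not something stated above):

```python
# Rough memory budget for the setup described above.
# Figures for model size, RAM, and VRAM come from the comment;
# the offload arithmetic is an illustrative assumption.
model_size_gb = 60   # quantized weights, GPT-OSS/Qwen3.5 class
vram_gb = 24         # RTX 3090
ram_gb = 32          # DDR5 system RAM

combined_gb = vram_gb + ram_gb                      # 56 GB total
offload_gb = max(0, model_size_gb - vram_gb)        # ~36 GB spills to RAM

print(f"combined budget: {combined_gb} GB vs {model_size_gb} GB of weights")
print(f"offloaded to system RAM: ~{offload_gb} GB")
# The weights alone exceed the combined budget, so running this at all
# relies on aggressive quantization and partial loading -- i.e. it is
# already at the floor of what counts as usable.
```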
> As if consumer hardware won't get more powerful
Now, with that last fact in hand, I'll let you plot a chart of what it has cost to provision that hardware over the past 2 years, and use it to prove me wrong about the affordability of local models.