There's a whole universe of tasks that aren't "fix a Github issue" or even related to coding in the slightest. A large number of those tasks doesn't necessarily get better with model updates. In many cases, the performance is similar but with different behavior so you have to rewrite prompts to get the same. In some cases the performance is just worse. Model updates usually only really guarantee to be better at coding, and maybe image understanding.