When I reviewed the conversations affected by this issue, they did not always align with my feeling of "degraded output".
Some were definitely below par, and I recall having to iterate on the generated code more than I wanted to. However, that was true for only a very small number of conversations.
So we're looking at a small set of affected conversations, and even within that small set, only a few will have degraded output, likely because the model can compensate for the reasoning defect over the long conversation.
I think it might affect real work if part of it requires a lot of thinking, i.e. something similar in nature to a puzzle.
There seems to be something wrong with how intermediate updates on the "commentary" channel are handled. Maybe the model gets confused about what's an intermediate update vs. what's the final answer? [1]
[1] https://github.com/openai/codex/issues/30364#issuecomment-48...