m3h • today at 4:45 PM • 1 reply

When I reviewed the conversations affected by this issue, they did not always align with my feeling of "degraded output".

Some were definitely below par, and I recall having to iterate on the generated code more than I wanted to. However, that was true of only a very small number of conversations.

So we're looking at a small set of affected conversations, and even within that small set, only a few will have degraded output, likely because the model can compensate for the reasoning defect over the long conversation.

Replies

nsingh2 • today at 5:16 PM

I think it might affect real work if part of it requires a lot of thinking, i.e. something similar in nature to a puzzle.

There seems to be something wrong with the intermediate updates on the "commentary" channel; maybe the model gets confused about what's an intermediate update vs. what's the final answer? [1]

[1] https://github.com/openai/codex/issues/30364#issuecomment-48...
