Tokens and speed are a factor, but does it take less back-and-forth to get things right? Being "fast and cheap but wrong" still has a cost that an otherwise "expensive and slow" exchange does not.
In my experience it spends a lot more tokens to do things. I wrote a tiny extension for omp that counts the number of "Actually" in the response and, if the count exceeds a threshold, stops execution and waits for me to tell it what to do. Even then it frequently just ignores basic instructions like "only write boilerplate, I will fill in the functionality".
Imo MiniMax and MiMo are a lot more reliable (and cheap).
Not Opus level, but close enough and cheap enough to get the job done.
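For anyone curious what the "Actually" counter above might look like, here is a rough sketch of the counting logic only. It is not the actual extension and does not use omp's extension API; `ActuallyGuard`, `feed`, and the default threshold are made-up names and values for illustration. It assumes the response arrives as streamed text chunks and that only the capitalized word "Actually" is counted.

```python
import re

# Hypothetical sketch: none of these names come from omp.
# Matches the capitalized word "Actually" on word boundaries.
ACTUALLY = re.compile(r"\bActually\b")


class ActuallyGuard:
    """Counts "Actually" in a streamed response and flags when to pause."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold  # assumed default, tune to taste
        self.text = ""
        self.count = 0

    def feed(self, chunk: str) -> bool:
        """Add one streamed chunk; return True once the count exceeds the threshold."""
        # Recount over the full text so a word split across two chunks is still caught.
        self.text += chunk
        self.count = len(ACTUALLY.findall(self.text))
        return self.count > self.threshold


guard = ActuallyGuard(threshold=2)
for chunk in ["Actually, let me check. ", "Actu", "ally, that is wrong. ", "Actually no."]:
    if guard.feed(chunk):
        print(f"Paused after {guard.count} occurrences, waiting for instructions")
        break
```

Recounting the whole buffer on every chunk is wasteful for long responses, but it keeps the sketch short and handles chunk boundaries correctly. How the pause is wired into the agent loop depends on the tool.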