I tried Gemini 2.5 Deep Think and was not very impressed ... too many hallucinations. In comparison, GPT 5.2 extended time hallucinates maybe <25% of the time, and if you ask another copy to proofread, it goes even lower.
I never tried 2.5. Three is pretty solid though, at least for my use case.
If there's a specific query you want me to run through it for comparison I'm happy to give it a go.