> it can take anywhere from 6 to 25 seconds to get a response (after lots of thinking) when I ask "Hello world".
Qwen thinking likes to second-guess itself a LOT when faced with simple/vague prompts like that. (I'll answer it this way. Generating output. Wait, I'll answer it that way. Generating output. Wait, I'll answer it this way... lather, rinse, repeat.) I suppose this is their version of "super smart fancy thinking mode". Try something more complex instead.
OK thanks! That's helpful. I ignorantly assumed simpler prompt == faster first response.
Indeed. Qwen doesn’t just second-guess itself, it third- and fourth-guesses itself.