I tried it, it was cool. I don't like nully's attitude though. Very dismissive and tough.
But I like your setup as a whole. I'll see if I can get some takeaways from it.
I use a tiered setup here too, with the lowest tier being just a local Qwen bot.
By the way, how do you handle the escalation from Haiku to Opus?
An error occurred. Try again.
But seriously, OP should somehow change this message to something like "Too many people are chatting right now, please try again in a moment."
(that would be even more appealing to recruiters)
I run an agent and borrow inspiration from what Claude Code used to do with "think hard", but instead of increasing the thinking budget, it promotes the request from Haiku to Opus.
It's not very natural though. Curious what other people are doing as well.
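
For anyone curious what that kind of keyword-triggered promotion might look like, here is a minimal sketch. It is not the actual code from my agent or from Claude Code: the tier labels, the `pick_tier` helper, and the trigger phrases are all assumptions for illustration, and you would map the labels to whatever model IDs your provider uses.

```python
import re

# Hypothetical tier labels, not real model IDs.
DEFAULT_TIER = "haiku"
ESCALATED_TIER = "opus"

# Assumed trigger phrases: "think hard" or "think harder", case-insensitive.
TRIGGER = re.compile(r"\bthink hard(er)?\b", re.IGNORECASE)


def pick_tier(prompt: str) -> tuple[str, str]:
    """Return (tier, prompt_to_send) for a user prompt.

    If the prompt contains a trigger phrase, route it to the escalated
    tier and strip the phrase so the model does not see it.
    """
    if TRIGGER.search(prompt):
        cleaned = TRIGGER.sub("", prompt)
        # Collapse the whitespace left behind by the removed phrase.
        cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
        return ESCALATED_TIER, cleaned
    return DEFAULT_TIER, prompt


if __name__ == "__main__":
    print(pick_tier("Think hard about this bug"))
    print(pick_tier("what time is it"))
```

The downside is the one mentioned above: the user has to know the magic phrase, which is why it does not feel very natural.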