A discussion about how to avoid paying for an expensive model is not about the expensive model. You can triage things with a cheap model running on Ollama. Heck, throw in GPT-4.1, which is free.
I don’t think triaging is necessarily an easy task.