it would be a really great option if it didn't lack vision
what do you use vision for? I have failed to find a workflow with it that makes sense. When I ask it to review screenshots of websites or whatever, it misses extremely obvious details like text flowing out of its container or overlapping other text, things being in entirely the wrong place, etc.
For coding?
this can be handled via MCP or a custom call to the lowest-cost model that has vision
someone did an agentic setup with a webcam: capture another computer's BIOS/boot screen -> upload to an image model -> description goes back to the agent
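rough sketch of that handoff, in case it helps: the text-only agent gets a tool that forwards an image to a separate vision model and only ever sees the text that comes back. Everything here is an assumption for illustration. `describe_image`, `to_data_url`, and `fake_vision_model` are hypothetical names, and the stub stands in for whatever cheap vision model or MCP tool you actually wire up.

```python
import base64
from typing import Callable

# Hypothetical interface: (prompt, image data URL) -> text description.
VisionModel = Callable[[str, str], str]


def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URL so they can be inlined in a request."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def describe_image(image_bytes: bytes, question: str, vision_model: VisionModel) -> str:
    """Tool exposed to the text-only agent: send the image out, return plain text."""
    return vision_model(question, to_data_url(image_bytes))


def fake_vision_model(prompt: str, image_url: str) -> str:
    # Stand-in for the real call to a low-cost vision model (assumption).
    return f"[description for: {prompt}] ({len(image_url)} chars of image data)"


if __name__ == "__main__":
    # In the webcam case, frame_bytes would be a captured frame of the other machine's screen.
    frame_bytes = b"abc"
    print(describe_image(frame_bytes, "What does the boot screen say?", fake_vision_model))
```

swap `fake_vision_model` for a real client and the agent side doesn't change, since it only deals with the returned string.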