Yeah. There's no way to verify what these providers are doing. The real future is running these models at home. Opus-level inference on our own hardware would be a dream come true.
How will anyone running home instances be able to compete against people who pay a bit of money to run much more powerful models on much more powerful hardware?
I've been dreaming of an LLM in a box, served over USB and bought off AliExpress, for a year and change now.
The LLM in a box is something you can buy today, but (1) it doesn't serve over USB by default, (2) the hardware costs $100k (not counting electricity) for 100 tps, and (3) you can't buy it from AliExpress.
Better to put that $100k in T-bills and just buy tokens, even at API prices.
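For anyone who wants to plug in their own numbers, here is a rough back-of-envelope sketch of that trade-off in Python. Only the $100k price and 100 tps come from the thread; the API price, T-bill yield, and utilization are placeholders you have to supply, and the function names are made up for illustration. It assumes simple (non-compounding) interest and ignores electricity, resale value, and any quality difference between models.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def box_tokens_per_year(tps: float, utilization: float) -> float:
    """Tokens a local box produces in a year at a duty cycle between 0 and 1."""
    return tps * SECONDS_PER_YEAR * utilization


def api_cost_per_year(tokens: float, usd_per_million: float) -> float:
    """Cost of buying the same number of tokens from an API."""
    return tokens / 1_000_000 * usd_per_million


def tbill_interest_per_year(principal: float, annual_yield: float) -> float:
    """Simple yearly interest if the hardware money stays in T-bills."""
    return principal * annual_yield


def breakeven_years(hardware_usd, tps, utilization, usd_per_million, annual_yield):
    """Years until API spend, net of T-bill interest, adds up to the hardware price.

    Returns None when the interest alone covers the API bill, in which case
    buying tokens never loses under these assumptions.
    """
    tokens = box_tokens_per_year(tps, utilization)
    net_yearly = api_cost_per_year(tokens, usd_per_million) - tbill_interest_per_year(
        hardware_usd, annual_yield
    )
    if net_yearly <= 0:
        return None
    return hardware_usd / net_yearly


if __name__ == "__main__":
    # Placeholder inputs: replace the last three with real quotes before trusting the result.
    years = breakeven_years(
        hardware_usd=100_000,   # from the thread
        tps=100,                # from the thread
        utilization=0.10,       # assumption: box is busy 10% of the time
        usd_per_million=10.0,   # assumption: hypothetical API price
        annual_yield=0.04,      # assumption: hypothetical T-bill yield
    )
    print("API never loses" if years is None else f"break-even after {years:.1f} years")
```

The answer swings almost entirely on utilization: 100 tps running 24/7 is about 3.15 billion tokens a year, and few home users get anywhere near that.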