Hacker News

irthomasthomas · yesterday at 6:25 PM

I don't think that's plausible, because they also just launched a high-speed variant which presumably has the inference optimizations and smaller batching, and costs about 10x as much.

Also, if you have inference optimizations, why not apply them to all models?