Hacker News

mips_avatar, yesterday at 8:30 PM (2 replies)

So it's basically just OpenRouter with Cloudflare Argo networking? I feel like they could do much more interesting stuff with their Replicate acquisition. Application-specific RL is getting so good, but there's no good way to deploy these models at scale. Even providers like Fireworks, which claim to let you deploy LoRAs in a scalable way, can't do it. For now I literally have to host my application's base load on a rack of 3090s in my garage, which seems silly, but it saves me $1k a month.


Replies

jonfromsf, today at 3:30 AM

Gilfoyle? Is that you?

vladgur, yesterday at 10:00 PM

Curious: which models are you able to run, and how many 3090s do they require at scale?
