Hacker News

afro88 · yesterday at 11:16 AM

What models and inference provider?


Replies

h14h · yesterday at 3:51 PM

I'm using kimi-k2-instruct as the primary model, and I'm building out tool calls that use gpt-oss-120b so it can opt in to reasoning capabilities.
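That primary-model-plus-reasoning-tool setup could be sketched roughly like this (the tool name `deep_reason`, its schema, and the routing helper are my own illustration of the pattern, not the commenter's actual code; it assumes an OpenAI-style tool-calling API):

```python
import json

# Tool schema exposed to the primary model (kimi-k2-instruct), letting it
# choose when to delegate a hard sub-problem to a reasoning model
# (gpt-oss-120b). Shape follows the common OpenAI-style "function" tool.
REASON_TOOL = {
    "type": "function",
    "function": {
        "name": "deep_reason",
        "description": "Delegate a hard sub-problem to a reasoning model "
                       "and return its answer.",
        "parameters": {
            "type": "object",
            "properties": {"problem": {"type": "string"}},
            "required": ["problem"],
        },
    },
}

def handle_tool_call(name: str, arguments: str, reason_fn) -> str:
    """Route a tool call emitted by the primary model.

    `arguments` is the JSON string the model produced; `reason_fn` is
    whatever callable actually queries the reasoning model (a second
    chat-completion request in practice, stubbed out here).
    """
    if name == "deep_reason":
        args = json.loads(arguments)
        return reason_fn(args["problem"])
    raise ValueError(f"unknown tool: {name}")
```

In a real loop you'd pass `REASON_TOOL` in the `tools` list of each request to the primary model, and feed `handle_tool_call`'s result back as a tool-result message, so reasoning only runs (and only costs tokens) when the primary model asks for it.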

Using Vultr for the VPS hosting, as well as their inference product, which AFAIK is by far the cheapest option for hosting models of this class ($10/mo for 50M tokens, and $0.20/M tokens after that). They also offer vector storage as part of their inference subscription, which makes it very convenient to get inference + durable memory & RAG w/ a single API key.
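At those rates the monthly bill is easy to estimate (a quick sketch of the pricing math as stated in the comment, not an official calculator):

```python
def monthly_cost(tokens_used: int) -> float:
    """Estimate Vultr inference cost per the comment's quoted rates:
    $10/mo covers the first 50M tokens, then $0.20 per extra 1M tokens."""
    base, included, overage_rate = 10.0, 50_000_000, 0.20
    extra = max(0, tokens_used - included)
    return base + (extra / 1_000_000) * overage_rate
```

So 50M tokens costs the flat $10, and doubling usage to 100M only doubles the bill to $20.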

Their inference product is currently in beta, so not sure whether the price will stay this low for the long haul.
