epistasis • yesterday at 9:16 PM • 1 reply

Extremely helpful, thanks! I think I'll go the OpenRouter route for a while to explore various models, then weigh the option.

I do kind of like basing decisions somewhat on the API costs, because they reveal what the true costs will be after the eventual rug-pull on subscription pricing.

Even comparing the API costs of Claude Code today to a year ago is pretty eye-watering. I think there's a ton of room, at least for my workflows, to go back to far less capable models.

I've run local models in the past a bit, and explored LLM ops somewhat, and have zero desire to do it anymore, haha. It's fun as a hobby, but there's tons of other homelab stuff for me to play with.

Replies

Imustaskforhelp • yesterday at 9:43 PM

Yeah, I think the API route is good: it's about as close to at-cost as it gets. There are some efficiencies to be had from, say, how DeepSeek does its inference, but as it stands, API prices are the most stable thing to go by. I wish you luck!

> I've run local models in the past a bit, and explored LLM ops somewhat, and have zero desire to do it anymore, haha. It's fun as a hobby, but there's tons of other homelab stuff for me to play with.

True. I personally haven't played with them much, because my hardware is more modest than even the usual personal-hardware recommendations, but I have spent some time with 350M-parameter models (M as in million!), like the recent LFM model, and very small Qwen models. They are just experiments, though. I would one day like to see more standardized models that we could run on our own laptops or desktops.

> Even comparing the API costs of Claude Code today to a year ago is pretty eye-watering. I think there's a ton of room, at least for my workflows, to go back to far less capable models.

Yeah, exactly. I would argue that using GLM 5.2, as you originally were, is probably much more sustainable over the long run, even at API prices. And it keeps you away from proprietary models and the issues surrounding them.
