Hacker News

otabdeveloper4 · yesterday at 12:16 PM

> most people won't ever run local inference because it sucks and is a resource hog most can't afford

a) Local inference for chats sucks. Using LLMs for chatting is stupid though.

b) Local inference is cheap if you're not selling a general-purpose chatbot.

There's a lot of fun stuff you can do with a local LLM that previously wasn't economically possible.

Two big ones are gaming (for example, text adventure games or complex board games like Magic: The Gathering) and office automation (word processors, Excel spreadsheets).
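
A rough sketch of what the gaming case can look like in practice: one turn of a text adventure driven by whatever model a local llama.cpp or Ollama server happens to have loaded. The port, model name, and prompt below are placeholders, not anything specific.

    # One turn of a text adventure against a local, OpenAI-compatible server
    # (llama.cpp's server defaults to port 8080; Ollama uses 11434).
    import requests

    SYSTEM = (
        "You are the narrator of a short text adventure. "
        "Describe the scene in two sentences, then list three possible actions."
    )

    def game_turn(history: list[dict], player_action: str) -> str:
        history.append({"role": "user", "content": player_action})
        resp = requests.post(
            "http://localhost:8080/v1/chat/completions",  # placeholder endpoint
            json={
                "model": "local-model",  # placeholder; the server uses whatever it loaded
                "messages": [{"role": "system", "content": SYSTEM}] + history,
                "temperature": 0.8,
                "max_tokens": 256,
            },
            timeout=120,
        )
        reply = resp.json()["choices"][0]["message"]["content"]
        history.append({"role": "assistant", "content": reply})
        return reply

    if __name__ == "__main__":
        history: list[dict] = []
        print(game_turn(history, "I push open the rusted door."))

No cloud bill, no per-token cost, which is what makes this sort of thing economically viable in the first place.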


Replies

data-ottawa · yesterday at 1:56 PM

It surprises me that semantic search never gets mentioned here.

If you can use the NPU to process embeddings quickly, you get some incredible functionality, from photo search by subject to near-match email search.

For consumer applications that's what I'm most excited about. It turns something that used to require large teams, data, and bespoke models into a commodity that any app can use.
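
A minimal sketch of that kind of search, assuming the sentence-transformers library and a small model like all-MiniLM-L6-v2 (whether the embedding step actually runs on an NPU depends entirely on the runtime backend; the example emails are made up):

    # Embed documents once, then answer queries by cosine similarity.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    emails = [
        "Invoice #4821 for the March consulting work is attached.",
        "Reminder: dentist appointment moved to Thursday at 3pm.",
        "Photos from the hiking trip last weekend, finally uploaded.",
    ]

    # Normalized vectors, so a dot product is cosine similarity.
    doc_vecs = model.encode(emails, normalize_embeddings=True)

    def search(query: str, top_k: int = 2) -> list[tuple[float, str]]:
        q = model.encode([query], normalize_embeddings=True)[0]
        scores = doc_vecs @ q
        best = np.argsort(scores)[::-1][:top_k]
        return [(float(scores[i]), emails[i]) for i in best]

    print(search("bill for consulting"))  # matches the invoice with zero keyword overlap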

g947o · yesterday at 12:36 PM

> There's lots of fun stuff

Ask your friends or a small business owner if they are going to spend $1k on a new laptop because "there's lots of fun stuff".

For office automation, you'll get a lot more mileage with Claude and similar.

BoredomIsFun · yesterday at 12:25 PM

> Local inference for chats sucks.

/r/SillyTavernAI would disagree with you.
