Hacker News

jqpabc123 · today at 12:58 PM · 3 replies

Another possibility not really addressed here --- local LLMs.

AI on hardware you own and control --- instead of a metered service provider. In other words, a repeat of the "personal computing" revolution but this time focused on AI.

TurboQuant could be a key step in this direction.


Replies

zozbot234 · today at 8:15 PM

TurboQuant helps with KV-cache quantization, which is not very relevant to local LLMs: the KV cache only becomes the dominant memory cost when you run inference with large batches. For small-scale inference, weights dominate. (Even if you stream weights from SSD, you'll want to cache a sizeable fraction of them to get workable throughput, and that dominates your memory usage.)
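The weights-vs-KV-cache tradeoff above can be checked with back-of-envelope arithmetic. The sketch below assumes a hypothetical Llama-7B-like model (32 layers, 32 KV heads of dimension 128, fp16 weights and cache, 4k context); the specific numbers are illustrative assumptions, not measurements of any real deployment.

```python
def weight_bytes(n_params: float, bytes_per_param: float = 2.0) -> float:
    # fp16 weights: 2 bytes per parameter
    return n_params * bytes_per_param

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_val: int = 2) -> float:
    # Factor of 2 covers the separate K and V tensors stored per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_val

# Assumed 7B-parameter model, 4096-token context.
weights_gib = weight_bytes(7e9) / 2**30
kv_batch1_gib = kv_cache_bytes(32, 32, 128, 4096, batch=1) / 2**30
kv_batch32_gib = kv_cache_bytes(32, 32, 128, 4096, batch=32) / 2**30

print(f"weights:            {weights_gib:.1f} GiB")   # ~13 GiB
print(f"KV cache, batch 1:  {kv_batch1_gib:.1f} GiB") # ~2 GiB
print(f"KV cache, batch 32: {kv_batch32_gib:.1f} GiB")
```

At batch size 1 (the typical local-inference case) the weights dwarf the KV cache, so KV quantization saves little; at batch 32 the cache overtakes the weights, which is where server-side batched inference benefits.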

schnitzelstoat · today at 1:10 PM

Yeah, I don't think local LLMs will keep up with what the massive corporations put out. But they might get to a level of performance where it just doesn't matter for most users.

And people would prefer to run a model locally for 'free' (not counting the energy cost) rather than paying for an LLM subscription.

netdevphoenix · today at 1:09 PM

Local LLMs don't sound profitable at all for those building them. If you really wanted a SOTA model, you would be paying eye-watering amounts to own it, unless you got an open-sourced one.
