Hacker News

embedding-shape yesterday at 6:06 PM

But why would I want the results to be faster but less reliable, rather than slower and more reliable? This feels like the sort of thing where you'd favor accuracy over speed; otherwise you're just degrading the quality control.


Replies

CamouflagedKiwi yesterday at 11:12 PM

It's not that you want it to be faster; you want the latency to be predictable and reliable, which is much more the case for local inference than for sending requests over a network (especially to the current set of frontier model providers, who don't exactly have standout reliability numbers).

bigyabai yesterday at 6:16 PM

The high-nines of fruit organization are usually not worth running a 400 billion parameter model to catch the last 3 fruit.

0cf8612b2e1e yesterday at 6:56 PM

A local, offline system you control is worth a lot. Introducing an external dependency guarantees you will have downtime outside of your control.
