
Tehnix · today at 9:08 PM

Bunch of negative sentiment in here, but I think this is pretty huge. There are quite a lot of applications where low latency matters more than having the latest, most capable model. Anywhere you want to turn something qualitative into something quantitative, without making it painfully obvious to the user that an LLM is doing the transformation.

As an example, we've been experimenting with letting users search in free-form text, and using an LLM to turn that into a structured search that fits our setup (rough sketch below). The latency of any existing model simply kills this: it's too high for something where users expect the delay of a network request and very little more.
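
To make the pattern concrete, here's a minimal sketch of that transformation, assuming an OpenAI-style chat completions API; the model name and the keywords/category/max_price schema are made-up placeholders, not our actual setup:

    import json
    from openai import OpenAI

    client = OpenAI()

    # Hypothetical filter schema; swap in whatever your search backend expects.
    SYSTEM = (
        'Convert the user\'s free-form search into JSON with the keys '
        '"keywords" (list of strings), "category" (string or null), '
        'and "max_price" (number or null). Respond with JSON only.'
    )

    def to_structured_search(query: str) -> dict:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; the whole point is a small, fast model
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": query},
            ],
            # Constrain the output to valid JSON so it can feed the search backend directly.
            response_format={"type": "json_object"},
        )
        return json.loads(resp.choices[0].message.content)

    # e.g. to_structured_search("cheap red running shoes under $50")
    # might yield {"keywords": ["red", "running shoes"], "category": "shoes", "max_price": 50}

The interesting constraint is that this call sits on the critical path of a search box, so every hundred milliseconds of model latency is felt directly by the user.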

There are plenty of other use cases like this.