Hacker News

Aurornis · yesterday at 10:36 PM · 1 reply

Application-specific AI models can be much smaller and faster than general-purpose, do-everything LLMs. This allows them to run locally.

They can also be made deterministic. Some extra care is required to avoid computation paths that lead to numerical differences across machines, but this can be accomplished reliably with small models that use integer math and kernels that follow a fixed order of operations. You get a lot more freedom to do this with small, application-specific models than when you're trying to run a big LLM in floating point across different GPU implementations.
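The core issue the comment alludes to can be shown in a few lines (my illustration, not the commenter's): floating-point addition is not associative, so a reduction's result depends on the order a kernel sums in, which can vary across GPUs and thread schedules. Integer arithmetic is exactly associative, so a quantized (fixed-point) model gives the same answer regardless of summation order.

```python
# Floating-point addition is not associative: the same three values
# summed in a different grouping can yield different results.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6
print(left == right)  # False

# Integer (fixed-point) arithmetic is exactly associative, so any
# summation order produces bit-identical results on any machine.
# Here we scale the same values by 10 into integers, as a quantized
# model would scale activations into int8/int32.
ia, ib, ic = 1, 2, 3
print((ia + ib) + ic == ia + (ib + ic))  # True
```

This is why the comment stresses integer math plus a fixed order of operations: either one alone closes the gap, and together they make cross-machine reproducibility straightforward.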


Replies

soraminazuki · today at 3:08 AM

> They can also be made to be deterministic.

Yeah, in the same way that pseudo-random number generators are "deterministic": they generate the exact same sequence of numbers every time, given the same seed!

But that's not the "determinism" people are referring to when they say LLMs aren't deterministic.