TensorToad • yesterday at 9:58 PM • 1 reply • view on HN

Super low latency inference might be helpful in applications like quant trading. However, in an era where a frontier model becomes outdated after 6 months, I wonder how useful it can be.

Replies

TensorToad • yesterday at 10:08 PM

Also, quant trading probably cares more about embedding the content than about generating output tokens.
