Hacker News

steve-atx-7600 today at 1:23 PM · 1 reply

Inference from an LLM is O(tokens^2)


Replies

halJordan today at 3:53 PM

Only in naive implementations of attention. Softmax attention with a KV cache is still O(tokens²) total, since each new token attends to all previous ones, but linear-attention and state-space variants bring the total cost down to O(tokens).
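
A toy sketch of the two cost regimes being argued about here (the function names and the FLOP counting are illustrative, not a real profiler): with standard softmax attention plus a KV cache, generating token t attends to all t previous positions, so the total work over n tokens grows quadratically; a linear-attention-style recurrence keeps a fixed-size running state, so each step costs the same regardless of sequence length.

```python
def cached_softmax_attention_cost(n: int) -> int:
    """Illustrative unit cost of generating n tokens with a KV cache:
    step t attends to t previous positions, so total = 1+2+...+n."""
    return sum(t for t in range(1, n + 1))  # n*(n+1)/2, i.e. O(n^2)

def linear_attention_cost(n: int) -> int:
    """Illustrative unit cost for a linear-attention/recurrent variant:
    a constant-size state is updated once per token, so total = n."""
    return n  # O(n)

if __name__ == "__main__":
    for n in (100, 1000):
        print(n, cached_softmax_attention_cost(n), linear_attention_cost(n))
```

Doubling the sequence length roughly quadruples the first count but only doubles the second, which is the distinction the reply is drawing.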