Hacker News
steve-atx-7600 • today at 1:23 PM • 1 reply
Inference from an LLM is O(tokens^2)
Replies
halJordan • today at 3:53 PM
Only in the naive implementations of attention
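
A minimal NumPy sketch of the naive attention both comments refer to (an illustration, not code from the thread): the full (n, n) score matrix is where the O(tokens²) compute and memory come from.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Materializes the full (n, n) score matrix: compute and memory
    # scale quadratically with the number of tokens n.
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                              # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)             # row-wise softmax
    return weights @ V                                         # (n, d)

n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = naive_attention(Q, K, V)
print(out.shape)
```

Sub-quadratic variants (e.g. tiled/streaming attention that never materializes the full score matrix, or linear-attention approximations) avoid this blow-up, which is the point of the reply.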