
refulgentis today at 12:41 AM

That math is for comparing all n-grams for all n <= N simultaneously, which isn't what was being discussed.

For any fixed n-gram size, the complexity is still O(N^2), same as standard attention.
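To make the distinction concrete, here is a rough counting sketch. It assumes "comparison" means pairing every contiguous n-gram with every other n-gram of the same length and charging one unit per pair. That cost model and the function names are illustrative assumptions, not something stated in the thread.

```python
def ngram_count(seq_len: int, n: int) -> int:
    """Number of contiguous n-grams in a sequence of seq_len tokens."""
    return max(seq_len - n + 1, 0)


def comparisons_fixed_n(seq_len: int, n: int) -> int:
    """Ordered pairs of n-grams for one fixed n (assumed unit cost per pair)."""
    return ngram_count(seq_len, n) ** 2


def comparisons_all_n(seq_len: int) -> int:
    """Sum of the fixed-n pair counts over every n from 1 to seq_len."""
    return sum(comparisons_fixed_n(seq_len, n) for n in range(1, seq_len + 1))


if __name__ == "__main__":
    for seq_len in (8, 16, 32):
        fixed = comparisons_fixed_n(seq_len, 1)
        total = comparisons_all_n(seq_len)
        print(f"N={seq_len}: fixed n=1 -> {fixed}, all n -> {total}")
```

Under this assumption, doubling N multiplies the fixed-n count by about 4 (quadratic), while the all-n total grows faster because it adds up one quadratic term per n-gram size.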


Replies

measurablefunc today at 1:39 AM

I was talking about all n-gram comparisons.
