_0ffh, today at 10:28 AM

Lookahead Sparse Attention should be playing a big role as well, as it dramatically slashes memory consumption.