
eurekin, yesterday at 8:22 PM

Batching lowers that, since the model weights are read from memory once per batch instead of once per request. Activation accumulation doesn't scale as nicely, since activations grow with the batch size.
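A rough back-of-the-envelope sketch of that trade-off, in Python. The function names and all the numbers are made up for illustration (a hypothetical 14 GB of weights and 0.5 GB of activations per request), and it assumes the weights are read exactly once per batch and that activation memory is linear in batch size, which real systems only approximate.

```python
def bytes_moved_per_request(weight_bytes, activation_bytes_per_request, batch_size):
    """Approximate memory traffic attributable to one request in a batch."""
    # Weights are read once per batch, so their cost is shared by every request.
    amortized_weights = weight_bytes / batch_size
    # Activations belong to a single request, so nothing is amortized here.
    return amortized_weights + activation_bytes_per_request


def total_activation_bytes(activation_bytes_per_request, batch_size):
    """Activation memory held for the whole batch (assumed linear in batch size)."""
    return activation_bytes_per_request * batch_size


# Hypothetical sizes, chosen only to show the shape of the curves.
WEIGHT_BYTES = 14e9
ACTIVATION_BYTES_PER_REQUEST = 0.5e9

for batch_size in (1, 8, 64):
    per_request = bytes_moved_per_request(
        WEIGHT_BYTES, ACTIVATION_BYTES_PER_REQUEST, batch_size
    )
    activations = total_activation_bytes(ACTIVATION_BYTES_PER_REQUEST, batch_size)
    print(
        f"batch={batch_size:3d}  "
        f"traffic/request={per_request / 1e9:6.2f} GB  "
        f"total activations={activations / 1e9:6.2f} GB"
    )
```

Under these assumptions the per-request weight traffic shrinks as 1/batch_size, while total activation memory keeps growing with the batch, so the activation term ends up dominating at large batch sizes.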