Hacker News

Readerium, today at 5:46 AM (3 replies)

LLMs are memory-bandwidth bound, not compute bound.


Replies

ondra, today at 6:38 AM

This is incorrect: prompt processing is compute bound.

AntiUSAbah, today at 8:15 AM

LLMs are bound by both; which factor dominates depends on the hardware.

icelancer, today at 7:40 AM

This is only true for some parts of the time cost function.
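
For anyone wanting to sanity-check the claims in this thread, here is a rough roofline-style sketch. It is a back-of-the-envelope estimate, not a measurement. Everything in it is an assumption for illustration: the function name `bound_regime`, the rule of thumb of about 2 FLOPs per parameter per token, 2-byte (fp16) weights read once per forward pass, and the made-up accelerator numbers. It ignores KV-cache traffic, attention cost, and activations, so real workloads can land differently.

```python
def bound_regime(tokens_per_pass, peak_flops, mem_bandwidth, bytes_per_param=2.0):
    """Classify one forward pass as 'memory' or 'compute' bound (roofline estimate).

    Assumptions (illustrative only):
      - about 2 FLOPs per parameter per token (one multiply, one add)
      - every weight is read from memory once per pass
    The parameter count cancels out, so only tokens per pass matters here.
    """
    # Arithmetic intensity: FLOPs performed per byte of weights loaded.
    intensity = 2.0 * tokens_per_pass / bytes_per_param
    # Ridge point: intensity at which the hardware shifts from memory to compute bound.
    ridge = peak_flops / mem_bandwidth
    return "compute" if intensity >= ridge else "memory"


# Hypothetical accelerator: 100 TFLOP/s peak, 1 TB/s memory bandwidth (ridge = 100).
PEAK_FLOPS = 100e12
MEM_BANDWIDTH = 1e12

print(bound_regime(1, PEAK_FLOPS, MEM_BANDWIDTH))     # single-token decode -> memory
print(bound_regime(2048, PEAK_FLOPS, MEM_BANDWIDTH))  # 2048-token prompt   -> compute
```

Under these assumptions, single-token decoding sits far below the ridge point and is memory bound, while processing a long prompt in one pass is compute bound. Changing the ratio of peak FLOP/s to bandwidth moves the crossover, which is the hardware dependence mentioned above.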