anemll, today at 4:23 AM

What hardware are you on? Most models are memory-bandwidth limited. The ANE was limited to 64 GB/s prior to the M3 Max or M4 Pro. If you are on an M1, the GPU will be significantly faster for 3-8B models, and that gap comes from memory bandwidth rather than ANE compute capability.
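
To put rough numbers on the bandwidth argument, here is a back-of-envelope sketch. It assumes the usual rule of thumb that generating one token streams every weight from memory once, so decode speed is capped at bandwidth divided by model size. The 64 GB/s figure is the ANE limit mentioned above; the 0.5 bytes per parameter (4-bit weights) and the 200 GB/s comparison figure are illustrative assumptions, not measurements of any particular chip, and the function name is made up for this example.

```python
def max_decode_tokens_per_s(params_billion, bytes_per_param, bandwidth_gb_s):
    """Rough upper bound on decode speed for a bandwidth-bound model.

    Assumes each generated token reads all weights from memory once and
    ignores KV cache traffic, compute time, and any caching effects.
    """
    # 1e9 params * bytes per param / 1e9 bytes per GB = size in GB
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb


if __name__ == "__main__":
    ANE_GB_S = 64.0     # limit quoted in the comment above
    OTHER_GB_S = 200.0  # hypothetical higher-bandwidth path, for comparison only
    BYTES_PER_PARAM = 0.5  # assumes 4-bit quantized weights

    for params in (3, 8):
        ane = max_decode_tokens_per_s(params, BYTES_PER_PARAM, ANE_GB_S)
        other = max_decode_tokens_per_s(params, BYTES_PER_PARAM, OTHER_GB_S)
        print(f"{params}B: ~{ane:.1f} tok/s at 64 GB/s, ~{other:.1f} tok/s at 200 GB/s")
```

These are ceilings, not predictions. Real throughput will be lower, but the ratio between the two paths should track the bandwidth ratio as long as the model stays bandwidth bound.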


Replies

SparkyMcUnicorn, today at 4:36 AM

M4 Max with 128GB of memory.