Hacker News

smusamashah, today at 5:58 AM

Does that mean that if it were embedded on a Taalas chip, it could generate ~50,000+ tokens per second?


Replies

Havoc, today at 3:45 PM

I think pretty much anything is going to get an enormous speed boost if the model isn't paying memory latency and is instead baked directly into the circuits, ASIC-style.