throwawayffffas, today at 7:17 PM

He meant prompt eval time, but have a look at these guys: https://www.youtube.com/watch?v=ndSA9T5yvmM

Over 2500 tokens per second on a single request, with 8 MI300X GPUs.