He meant prompt eval time, but have a look at these guys: https://www.youtube.com/watch?v=ndSA9T5yvmM
Over 2500 tokens per second on a single request, with 8 MI300X.