Hacker News

fulafel yesterday at 7:22 PM

So it's not measuring output tokens/s, just how long it takes to start generating tokens. Seems we'll have to wait for independent benchmarks to get useful numbers.


Replies

dotancohen yesterday at 9:32 PM

For many workflows involving real-time human interaction, such as voice assistants, this is the most important metric. Very few tasks are as sensitive to quality, once a certain response quality threshold has been achieved, as are the software planning and writing tasks that most HN readers are likely familiar with.
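
To make the two metrics in this exchange concrete, here is a minimal sketch that times a token stream and reports both time to first token and output tokens/s. It is not taken from any benchmark mentioned in the thread: `measure_stream`, `StreamTiming`, and the `clock` parameter are hypothetical names, and the token stream is assumed to be any Python iterable that yields tokens as they arrive.

```python
import time
from typing import Callable, Iterable, NamedTuple, Optional


class StreamTiming(NamedTuple):
    ttft_s: Optional[float]        # time to first token, in seconds
    tokens_per_s: Optional[float]  # decode rate after the first token
    n_tokens: int


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamTiming:
    """Consume a token stream and time it (hypothetical helper)."""
    start = clock()
    first: Optional[float] = None
    last = start
    n = 0
    for _ in tokens:
        last = clock()
        if first is None:
            first = last  # arrival time of the first token
        n += 1

    if first is None:
        # The stream produced nothing, so neither metric is defined.
        return StreamTiming(None, None, 0)

    ttft = first - start
    decode_time = last - first
    # Rate is measured over the tokens that arrived after the first one,
    # so a slow start does not drag down the reported decode speed.
    rate = (n - 1) / decode_time if n > 1 and decode_time > 0 else None
    return StreamTiming(ttft, rate, n)
```

Keeping the two numbers separate reflects both comments: a voice assistant mostly cares about `ttft_s`, while a long code-generation job mostly cares about `tokens_per_s`. The injectable `clock` is only there so the function can be checked with fixed timestamps.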
