Wondering about Google Multi-Token prediction, why isn't this being implemented into every new ...

maxiniol • yesterday at 11:30 PM • 1 reply

Wondering about Google's Multi-Token Prediction: why isn't this being implemented in every new major model? Is the 750 tokens/s achieved using this technique?

Replies

adam_arthur • yesterday at 11:37 PM

MTP or something similar is probably being used on the backend, but that's transparent to the end user.
