> I wonder if there is a more general solution that can make models spend more compute on making important choices, while making generation of the "obvious" tokens cheaper and faster.
I think speculative decoding counts as a (perhaps crude) way of implementing this? The cheap draft model handles the "obvious" tokens, and the expensive target model only has to verify them and step in where the draft goes wrong.
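To make the connection concrete, here is a minimal sketch of the greedy speculative-decoding control flow. The two "models" (`draft_next`, `target_next`) are hypothetical stand-ins I made up for illustration; in a real system the target model would score all drafted positions in a single batched forward pass rather than one call per token.

```python
def draft_next(prefix):
    # Hypothetical cheap draft model: a stand-in rule, not a real model.
    return (prefix[-1] + 1) % 10 if prefix else 0

def target_next(prefix):
    # Hypothetical expensive target model: also a stand-in rule.
    return (prefix[-1] + 1) % 10 if prefix else 0

def speculative_step(prefix, k=4):
    """Draft k tokens cheaply, then keep the longest prefix the target agrees with."""
    drafted, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        drafted.append(t)
        ctx.append(t)

    accepted, ctx = [], list(prefix)
    for t in drafted:
        # In a real implementation these target predictions come from one forward pass.
        if target_next(ctx) == t:       # "obvious" token: draft and target agree
            accepted.append(t)
            ctx.append(t)
        else:                           # disagreement: fall back to the target's choice
            accepted.append(target_next(ctx))
            break
    else:
        # All drafted tokens accepted; the target contributes one bonus token.
        accepted.append(target_next(ctx))
    return list(prefix) + accepted

print(speculative_step([3]))  # e.g. [3, 4, 5, 6, 7, 8]
```

So the easy tokens cost roughly a draft-model call each, while the target model's full compute is spent on verification and on the positions where its choice actually differs.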