Havoc, today at 8:20 AM

You can also try speculative decoding, using the E2B model as the draft model. Under some conditions it can give a decent speedup.
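For anyone unfamiliar with the idea, here is a rough toy sketch of greedy speculative decoding. It is not tied to any real inference engine: `toy_target`, `toy_draft`, and `speculative_decode` are hypothetical names made up for illustration, and the "models" are plain functions over integer tokens. The small draft model guesses `k` tokens ahead, the large target model checks them, and every guess that matches is kept. The output is the same as decoding with the target alone; the gain comes from needing fewer target passes when the draft guesses well.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy "model": context -> next token


def toy_target(ctx: List[Token]) -> Token:
    # Hypothetical stand-in for the large model.
    return (ctx[-1] * 3 + 1) % 17


def toy_draft(ctx: List[Token]) -> Token:
    # Hypothetical stand-in for the small model: agrees with the target
    # except when the last token is a multiple of 5.
    guess = toy_target(ctx)
    return guess if ctx[-1] % 5 else (guess + 1) % 17


def greedy_decode(model: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[Token], max_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    out = list(prompt)
    target_passes = 0
    while len(out) - len(prompt) < max_new:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target verifies the proposal. A real engine would score all
        #    positions in one batched forward pass; here that is counted as
        #    a single "pass" even though the toy calls target() per token.
        target_passes += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if expected != tok:
                accepted.append(expected)  # first mismatch: keep the target's token
                break
            accepted.append(tok)
        else:
            # Every draft token matched, so the target's next token comes for free.
            accepted.append(target(out + accepted))
        out.extend(accepted)
    return out[: len(prompt) + max_new], target_passes


tokens, passes = speculative_decode(toy_target, toy_draft, [1], max_new=20)
baseline = greedy_decode(toy_target, [1], max_new=20)
print(tokens == baseline, passes)
```

Whether this helps in practice depends on how often the draft agrees with the target and how cheap the draft is by comparison. If the acceptance rate is low, the extra draft work can cancel out the gain.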