
Havoc • today at 8:20 AM

You can also try speculative decoding with the E2B model. Under some conditions it can result in a decent speedup.
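
For anyone who hasn't seen the idea, here's a rough toy sketch of the greedy variant. A cheap draft model guesses a few tokens ahead and the larger target model checks them, keeping the matching prefix. Everything in it is made up for illustration: `speculative_decode`, `toy_target`, `toy_draft` and `k` are hypothetical names, and the two "models" are plain functions rather than real LLMs. I'm assuming the E2B model would play the draft role. Real engines verify all drafted tokens in one batched forward pass of the target, which is where the speedup comes from.

```python
from typing import Callable, List, Tuple

# A "model" here is just a greedy next-token function: context -> token id.
NextToken = Callable[[List[int]], int]


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       max_new: int, k: int = 3) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, target verification rounds)."""
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < max_new:
        # 1. The draft model cheaply proposes k tokens.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The target checks each proposed position. A real engine does this
        #    in one batched pass; this toy simulates it call by call.
        rounds += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # first mismatch: take the target's token
                break
            accepted.append(tok)
        else:
            accepted.append(target(out + accepted))  # all matched: one bonus token
        out.extend(accepted)
    return out[:len(prompt) + max_new], rounds


def greedy_decode(model: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain one-token-at-a-time decoding, used as the baseline."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def toy_target(ctx: List[int]) -> int:
    # Hypothetical stand-in for the big model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    # Hypothetical stand-in for the small model: agrees except after a 4.
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


baseline = greedy_decode(toy_target, [0], max_new=12)
tokens, rounds = speculative_decode(toy_target, toy_draft, [0], max_new=12, k=3)
```

In this greedy form the output matches target-only decoding token for token, so the draft only changes how many target rounds are needed. The gain depends on how often the draft agrees with the target, which fits the "under some conditions" caveat.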
