
grumpoholic, today at 1:01 PM

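With speculative decoding, however, you can use additional models to speed up generation: a smaller draft model proposes several tokens ahead, and the main model only has to verify them.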
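To make that concrete, here is a rough sketch of the greedy variant of speculative decoding. Everything in it is an assumption for illustration: the `Model` type, `speculative_decode`, `greedy_decode`, and the toy `target`/`draft` functions are hypothetical stand-ins, not any library's API. Real implementations verify all drafted tokens in one batched forward pass of the large model (which is where the speedup comes from) and typically use a probabilistic accept/reject rule instead of exact matching.

```python
from typing import Callable, List

# Hypothetical interface: a "model" maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def greedy_decode(model: Model, prompt: List[int], num_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(num_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       num_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: the draft proposes, the target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + num_new
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target checks each proposed position. This loop stands in
        #    for what would be a single batched forward pass in practice.
        for i, guess in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if guess != expected:
                # Keep the agreed prefix, then take the target's own token.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            # Every drafted token matched the target, so accept them all.
            tokens.extend(proposal)
    return tokens


# Toy models (assumptions): the target counts up modulo 10; the draft
# agrees with it except when the last token is 3 or 7.
def target(tokens: List[int]) -> int:
    return (tokens[-1] + 1) % 10


def draft(tokens: List[int]) -> int:
    return 0 if tokens[-1] % 4 == 3 else (tokens[-1] + 1) % 10
```

The point of the exact-match check is that the output is identical to what the target would have produced on its own; the draft model only changes how many target passes are needed, not the result.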