Hacker News
grumpoholic
•
today at 1:01 PM
With speculative decoding, however, you can use additional models to speed up generation: a smaller draft model proposes tokens and the main model verifies them.
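
For anyone who hasn't seen how that works, here's a rough sketch of the greedy variant. Everything in it is made up for illustration: `target_model` and `draft_model` are toy stand-in functions (not real LLMs or any library's API), and `k` is an arbitrary draft length. The draft guesses a few tokens ahead, the target checks them, and you keep the longest matching prefix plus one token from the target. Real implementations verify all drafted positions in a single batched forward pass and, when sampling, use a rejection-sampling rule instead of exact match.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token prefix to the next token (greedy).
Model = Callable[[List[int]], int]


def target_model(prefix: List[int]) -> int:
    # Hypothetical "large" model: sum of the last two tokens, mod 10.
    return (prefix[-1] + prefix[-2]) % 10


def draft_model(prefix: List[int]) -> int:
    # Hypothetical "small" model: same rule, but it always guesses 0 after a 5.
    return 0 if prefix[-1] == 5 else (prefix[-1] + prefix[-2]) % 10


def greedy_decode(model: Model, prompt: List[int], n: int) -> List[int]:
    # Baseline: one model call per generated token.
    out = list(prompt)
    for _ in range(n):
        out.append(model(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n: int, k: int = 4
) -> Tuple[List[int], int]:
    out = list(prompt)
    limit = len(prompt) + n
    target_passes = 0  # counts verification rounds (one batched pass each in practice)
    while len(out) < limit:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target checks each proposed position in order.
        target_passes += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if expected != tok:
                # First mismatch: keep the target's token and drop the rest.
                accepted.append(expected)
                break
            accepted.append(tok)
        else:
            # All k guesses matched, so the target contributes one bonus token.
            accepted.append(target(out + accepted))

        out.extend(accepted)
    return out[:limit], target_passes


prompt = [1, 1]
baseline = greedy_decode(target_model, prompt, 20)
spec_out, passes = speculative_decode(target_model, draft_model, prompt, 20)
print(spec_out == baseline, passes)
```

The output is identical to what the target would produce on its own; the speedup comes from needing fewer target passes than tokens whenever the draft guesses well.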