Hacker News
yieldcrv
•
today at 4:32 AM
impressive, we can get high tokens/s with 8B-param models and double that with MTP (multi-token prediction)