Hacker News
byzantinegene
•
today at 9:44 AM
We're already doing that. It's called distillation, and it's how models like DeepSeek are trained.