Hacker News

epolanski, yesterday at 2:06 PM

The distillation you're talking about is about cutting the number of weights; it has nothing to do with extracting Q&A pairs from another model.