Hacker News

programjames, yesterday at 8:24 PM

Don't they add a KL loss term to the frozen model's outputs?
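
To make the question concrete, here is a rough sketch of what such a term could look like, in plain Python. The names (`softmax`, `kl_divergence`, `total_loss`, `beta`) are hypothetical, and the KL direction (trained model relative to frozen model) and the weighting are assumptions for illustration, not taken from any particular paper or codebase.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def total_loss(task_loss, trained_logits, frozen_logits, beta=0.1):
    """Task loss plus a KL penalty that keeps the trained model's output
    distribution close to the frozen model's. `beta` is an assumed weight."""
    p = softmax(trained_logits)  # model being trained
    q = softmax(frozen_logits)   # frozen reference model (no gradient in practice)
    return task_loss + beta * kl_divergence(p, q)
```

If the two models produce the same logits, the penalty vanishes and only the task loss remains; the further the trained model drifts from the frozen one, the larger the added term.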