klipt, yesterday at 1:46 AM

> [The model] I trained is relatively large, as it's a single model that supports all language pairs (to leverage transfer learning).

Now that you have the larger model, if you wanted a smaller model for just one language pair, I guess you could use distillation?
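
To make the distillation idea concrete, here is a rough sketch of a word-level distillation loss, assuming the large multilingual model acts as the teacher and a small single-pair model as the student. The function names, the temperature value, and the toy logits are all illustrative assumptions, not anything from the parent's actual setup.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 flattens the distribution so the student also
    # sees the teacher's relative preferences among unlikely tokens.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over the vocabulary for one target token.
    # Hypothetical helper: a real training loop would average this over
    # every token position in a batch of sentence pairs.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits over a 4-token vocabulary (made-up numbers).
teacher = [4.0, 1.0, 0.5, -2.0]
student = [2.0, 2.0, 0.0, -1.0]
print(distillation_loss(teacher, student))
```

The student would be trained only on sentences from the one language pair you care about, minimizing this loss (usually mixed with the ordinary cross-entropy on reference translations). A common alternative for translation is sequence-level distillation, where the teacher simply translates the training sources and the student trains on those outputs.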