cold_harbor, today at 11:38 AM

LoRA won't fix the tokenization problem. Norwegian on a typical English-heavy BPE vocab uses 1.5-2x more tokens per word, and that compounds into real inference cost, not just a quality hit.
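
If you want to check the tokens-per-word gap on your own data, here's a rough sketch. It assumes you pass in whatever tokenizer you actually use as a plain function; `toy_chunk_tokenize` is a made-up stand-in (not real BPE) so the snippet runs on its own, and the sample sentences are just illustrations, so the numbers it prints mean nothing about any real model.

```python
from typing import Callable, List


def tokens_per_word(text: str, tokenize: Callable[[str], List[str]]) -> float:
    """Average tokens per whitespace-separated word (often called fertility)."""
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    return len(tokenize(text)) / len(words)


def relative_token_count(fertility_target: float, fertility_baseline: float) -> float:
    """How many times more tokens the target language needs for the same word count.

    Per-token pricing and decode steps scale roughly linearly with this ratio.
    """
    if fertility_baseline <= 0:
        raise ValueError("baseline fertility must be positive")
    return fertility_target / fertility_baseline


def toy_chunk_tokenize(text: str, max_len: int = 4) -> List[str]:
    # Hypothetical stand-in, NOT real BPE: cuts each word into pieces of
    # at most max_len characters. Swap in your real tokenizer here.
    pieces: List[str] = []
    for word in text.split():
        pieces.extend(word[i:i + max_len] for i in range(0, len(word), max_len))
    return pieces


if __name__ == "__main__":
    english = "I like to read books about machine learning"
    norwegian = "Jeg liker å lese bøker om maskinlæring"
    f_en = tokens_per_word(english, toy_chunk_tokenize)
    f_no = tokens_per_word(norwegian, toy_chunk_tokenize)
    print(f"en tokens/word: {f_en:.2f}")
    print(f"no tokens/word: {f_no:.2f}")
    print(f"relative token count: {relative_token_count(f_no, f_en):.2f}x")
```

To get a meaningful number, replace `toy_chunk_tokenize` with a function wrapping your model's tokenizer and run it over a decent-sized parallel or comparable corpus rather than one sentence.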