Hacker News

p-e-w · today at 8:35 AM

That’s false. Larger LLMs learn token decompositions through their training, and in fact modern training pipelines are designed to occasionally produce uncommon tokenizations (including splitting words into individual characters) for this reason. Frontier models have no trouble spelling words even without tools. Even many mid-sized models can do that.
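The kind of tokenization variation described above is in the spirit of BPE-dropout: during training, some merges are randomly skipped so the same word is sometimes seen as coarse tokens and sometimes as individual characters. A toy sketch (the merge table and function names here are illustrative, not any real pipeline's code):

```python
import random

def tokenize_with_dropout(word, merges, p=0.1, rng=random):
    """Toy BPE-style tokenizer that skips each merge with probability p.

    With p=0 the word collapses into its usual coarse tokens; with p=1
    it stays as individual characters. Intermediate p values expose the
    model to many decompositions of the same word during training.
    """
    tokens = list(word)  # start from single characters
    changed = True
    while changed:
        changed = False
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            if pair in merges and rng.random() >= p:
                tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
                changed = True
                break
    return tokens

# Illustrative merge table for the word "hello"
merges = {("h", "e"), ("l", "l"), ("he", "ll"), ("hell", "o")}
print(tokenize_with_dropout("hello", merges, p=0.0))  # → ['hello']
print(tokenize_with_dropout("hello", merges, p=1.0))  # → ['h', 'e', 'l', 'l', 'o']
```

Seeing both forms of the same word gives the model training signal linking whole-word tokens to their character spellings, which is one plausible route to spelling without tools.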


Replies

kilpikaarna · today at 9:11 AM

Wait, where can I learn more about this? I don't doubt that varying the tokenization during training improves results, but how does or would that enable token introspection?
