Hacker News

thedevilslawyer, yesterday at 1:47 PM

Interesting. Some questions: Would you say Polish is more dense or less dense than English? It's interesting to hear that code quality is not suffering but the response text is sillier or blunter. Any other discrepancies compared to English?


Replies

pyonpyon, yesterday at 1:54 PM

I would say it certainly can be more dense, but even when it is, the tokenizers count it as more tokens. Last time I checked my agents.md in the OpenAI tokenizer, the Polish version ate roughly 30-40% more tokens than the English version at roughly 1:1 meaning.
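
For anyone who wants to reproduce that kind of comparison on their own files, here is a minimal sketch of the arithmetic. Everything in it is an assumption for illustration: the helper names `token_overhead` and `utf8_byte_count` are hypothetical, the two sample sentences are made up, and the UTF-8 byte counter is only a stand-in for a real tokenizer, so its numbers will not match what an actual model tokenizer reports.

```python
from typing import Callable


def token_overhead(count_tokens: Callable[[str], int], baseline: str, other: str) -> float:
    """Return how many percent more tokens `other` needs than `baseline`."""
    base = count_tokens(baseline)
    if base == 0:
        raise ValueError("baseline text produced zero tokens")
    return (count_tokens(other) - base) / base * 100.0


def utf8_byte_count(text: str) -> int:
    # Stand-in counter (an assumption, not a real tokenizer): UTF-8 bytes.
    # Polish diacritics such as "ą" take two bytes each.
    return len(text.encode("utf-8"))


# Hypothetical sample pair with roughly the same meaning.
english = "Always run the tests before committing."
polish = "Zawsze uruchamiaj testy przed zatwierdzeniem zmian."

overhead = token_overhead(utf8_byte_count, english, polish)
print(f"Polish needs {overhead:.1f}% more units than English")
```

To measure real tokens, swap in a counter backed by an actual tokenizer. With the `tiktoken` package installed, something like `lambda s: len(enc.encode(s))` with `enc = tiktoken.get_encoding("cl100k_base")` should work, though which encoding matches a given model is something to check rather than assume.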