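When you're picking the most likely token at each step, you get the least surprising one: the token with the lowest surprisal (-log p), and so the least information per token. Strictly speaking, entropy belongs to the whole next-token distribution rather than to a single token, so "least entropy" here really means "least surprisal".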
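To make this concrete, here is a minimal sketch on a toy example. The four-token distribution `probs` and the helper `surprisal_bits` are made up for illustration and do not come from any real model. It computes each token's information content as -log2(p) and checks that the most likely token carries the fewest bits.

```python
import math

def surprisal_bits(p: float) -> float:
    """Information content of an outcome with probability p, in bits."""
    return -math.log2(p)

# Hypothetical next-token distribution (illustrative numbers only).
probs = {"the": 0.5, "a": 0.25, "cat": 0.125, "zebra": 0.125}

# Greedy choice: the single most likely token.
greedy = max(probs, key=probs.get)

# Per-token information content.
surprisals = {tok: surprisal_bits(p) for tok, p in probs.items()}

# Entropy of the distribution: the probability-weighted average surprisal.
entropy = sum(p * surprisal_bits(p) for p in probs.values())

for tok, bits in surprisals.items():
    print(f"{tok}: p={probs[tok]}, surprisal={bits} bits")
print(f"greedy pick: {greedy}, entropy: {entropy} bits")
```

In this toy case the greedy pick carries 1 bit while the distribution as a whole averages 1.75 bits per token, so always taking the top token yields less information per token than sampling from the distribution would on average.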