Hacker News

causal · yesterday at 3:39 PM

You seem to be going off the title, which is plainly incorrect and not what the paper says. The paper demonstrates HOW different models can learn similar representations due to "data, architecture, optimizer, and tokenizer".

"How Different Language Models Learn Similar Number Representations" (actual title) is distinctly different from "Different Language Models Learn Similar Number Representations" - the latter implying some immutable law of the universe.


Replies

dnautics · yesterday at 5:12 PM

> latter implying some immutable law of the universe

I think the implication is slightly weaker -- it implies some immutable law of training datasets?

NooneAtAll3 · yesterday at 7:58 PM

I don't understand your argument

"How X happens" still implies that X happens, just adds additional explanation on top
