gn_central • yesterday at 3:15 PM

Curious if this similarity comes more from the training data or the model architecture itself. Did they look into that?

OtherShrezzing • yesterday at 3:21 PM

The paper's opening paragraph says that both are important and that both were researched.
