Hacker News

gn_central yesterday at 3:15 PM

Curious if this similarity comes more from the training data or the model architecture itself. Did they look into that?


Replies

OtherShrezzing yesterday at 3:21 PM

They state in the opening paragraph that both are important, and that the paper investigates both.