As another user suggested, train on the corpus that ends with the white paper publication.

SJMG • today at 1:40 AM • 2 replies

Replies

That’s not feasible. Apparently only SOTA models exhibit this behavior, and setting the cutoff date at the paper's publication would significantly hinder the model's capabilities. Besides, try convincing anyone to spend millions upon millions of dollars training a model whose primary goal is possibly being able to deanonymize one person.

smeej • today at 4:36 AM

But then compare it to the corpus of any of the suspects since the whitepaper publication.

It's one thing to sound like Satoshi before the whitepaper, but does anyone still sound like Satoshi?


Replies