Hacker News

xmcqdpt2 yesterday at 1:29 PM

That would be great because "I got it from Wikipedia and Arxiv" isn't exactly useful.

From reading your second link (and please tell me if I got it wrong), it sounds like attribution isn't actually traced to the training data itself but to prototypes, which are then linked a posteriori to likely sections of the training data. The attribution isn't exact, right? It's more like "these are the likely texts that contributed to one of the prototypes that produced the final answer." Specifically, the section of PRISM titled "Nearest neighbour search" sounds like you could have a prototype that draws from 1000 sources, 3 of them more than the others; the model identifies those 3, but the other ones might matter just as much in aggregate?
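To make the worry concrete, here's a toy numeric sketch (all numbers made up for illustration, not from PRISM): a prototype's attribution mass is spread over 1000 sources, 3 of which dominate individually, yet the long tail can outweigh them in aggregate.

```python
import numpy as np

# Hypothetical attribution weights over 1000 sources for one prototype.
rng = np.random.default_rng(0)
tail = rng.dirichlet(np.ones(997)) * 0.55  # 997 small contributors, summing to 0.55
top3 = np.array([0.20, 0.15, 0.10])        # 3 dominant contributors, summing to 0.45
weights = np.concatenate([top3, tail])

top_k = np.sort(weights)[::-1][:3]
# What a top-3 nearest-neighbour search would surface:
print(round(float(top_k.sum()), 2))
# What the unreported tail contributes in aggregate (larger here):
print(round(float(weights.sum() - top_k.sum()), 2))
```

In this contrived setup the three reported sources carry 0.45 of the mass while the 997 unreported ones carry 0.55, which is exactly the "the rest might matter just as much in aggregate" concern.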

It says that the decomposition is linear. Can you remove a given prototype and run inference again without it? That would be really cool.


Replies

adebayoj yesterday at 2:03 PM

This part of the claim is involved, so we have future posts planned to clarify it. And yes, you can remove a prototype and generate again; we show examples in the PRISM post.

In PRISM, for any token the model generates, you can say: it generated this token based on these sources. During training, the model is 'forced' to match each prototype to specific tokens (or groups of tokens) in the data. A prototype can actually be an exact match to a training data point. Think of it like clustering: the prototype is a stand-in for training data that looks like it, and we force (and therefore know) how much the model will rely on that prototype for any token it generates.

The demo in the post is not as granular because we don't want to overwhelm folks. We'll show granular attribution in the future.