
adebayoj, yesterday at 12:40 PM

Great questions. We weren't very explicit about the training data attribution process, and we'll discuss it in more detail in future work. For a generated sentence, we can track down which parts of the training data were interpolated to create it. For those training sentences, we then compare the concepts in the generated text against the concepts in the training text.

We can attribute to exact sentences and chunks in the training data. For the first release, we are sharing only concept similarities. Over the coming weeks, we'll share and discuss how you can actually map to the exact training sentence and chunk with the model.

For a technical overview of how some of these models work, check this link out: https://www.guidelabs.ai/post/prism/


Replies

xmcqdpt2, yesterday at 1:29 PM

That would be great because "I got it from Wikipedia and Arxiv" isn't exactly useful.

From reading your second link (and please tell me if I got it wrong) it sounds like it isn't actually tracking to training data but to prototypes, which are then linked a posteriori to likely sections of the training data. The attribution isn't exact, right? It's more like "these are the likely texts that contributed to one of those prototypes that produced the final answer." Specifically, the bit in PRISM titled "Nearest neighbour Search" sounds like you could have a prototype that draws on 1000 sources, 3 of them more heavily than the others, so the model identifies those 3, but the other ones might matter just as much in aggregate?
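To make the aggregate concern concrete, here is a minimal sketch with made-up numbers. Nothing in it comes from PRISM: the weights, the source count, and the top-3 cutoff are all assumptions chosen only to illustrate how a top-k report can leave most of a prototype's mass unreported.

```python
# Hypothetical numbers, not taken from PRISM: one prototype whose mass is
# spread over 1000 training sources, 3 of which weigh more than the rest.
n_sources = 1000
top_k = 3
top_weight = 0.02  # assumed weight of each of the 3 strongest sources

# The remaining mass is split evenly across the other 997 sources.
tail_weight = (1.0 - top_k * top_weight) / (n_sources - top_k)
weights = [top_weight] * top_k + [tail_weight] * (n_sources - top_k)

# Nearest-neighbour style attribution: report only the k largest weights.
ranked = sorted(range(n_sources), key=lambda i: weights[i], reverse=True)
reported = ranked[:top_k]

top_share = sum(weights[i] for i in reported)
tail_share = sum(weights[i] for i in ranked[top_k:])

print(f"top-{top_k} share: {top_share:.2f}, everything else: {tail_share:.2f}")
```

In this toy setup each reported source is individually about 20 times heavier than any unreported one, yet the unreported sources together carry most of the weight.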

It says that the decomposition is linear. Can you remove a given prototype and infer again without it? That would be really cool.
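As a rough sketch of what removing a prototype would mean if the output really is a linear sum of prototype contributions: the activations, the per-prototype score vectors, and the `scores` helper below are all hypothetical, and whether the actual model supports this kind of re-inference is exactly the open question.

```python
# Hypothetical setup: the score for each candidate token is a weighted sum of
# per-prototype contributions, score[t] = sum_k activation[k] * direction[k][t].
activations = [0.7, 0.2, 0.1]  # assumed prototype activations
directions = [                 # assumed per-prototype token scores
    [2.0, 0.0, 1.0],
    [0.0, 3.0, 1.0],
    [1.0, 1.0, 4.0],
]

def scores(acts, dirs, drop=None):
    """Sum prototype contributions, optionally skipping prototype `drop`."""
    n_tokens = len(dirs[0])
    out = [0.0] * n_tokens
    for k, (a, d) in enumerate(zip(acts, dirs)):
        if k == drop:
            continue
        for t in range(n_tokens):
            out[t] += a * d[t]
    return out

def argmax(xs):
    """Index of the largest score."""
    return max(range(len(xs)), key=xs.__getitem__)

full = scores(activations, directions)
ablated = scores(activations, directions, drop=0)

print("top token with all prototypes:", argmax(full))
print("top token without prototype 0:", argmax(ablated))
```

Because the sum is linear, the difference between `full` and `ablated` is exactly prototype 0's contribution, so its effect on the answer can be read off directly.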
