logoalt Hacker News

inkysigmatoday at 9:20 AM0 repliesview on HN

At a high level, the text samples are how the relationships are derived. If we treat text samples as sequences of tokens, then the sequences of tokens describe the joint distributions they occur together which confers the relationship between them. Iirc, this is related to the idea of the distributional hypothesis in NLP: the idea the semantics of words should be similar if they occur in similar situations.