michaeld123, yesterday at 2:12 PM

It's not so much the prompt as the volume. This overall project has involved >100M LLM inferences, spread across 1.9M headwords. The building block is "what words or short terms are related to X?", but scaled out, plus a lot of filtering. So it's mostly a reflection of English, and also a reflection of what ChatGPT and Claude report back as a significant collocation.
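
To make the building block concrete, here is a rough sketch of "ask for related terms, then filter", not the project's actual code. Everything in it is an assumption for illustration: the `ask` callable stands in for whatever model client is used, the reply is assumed to be a comma-separated list, and the repeat-and-vote filter (`runs`, `min_votes`) is just one plausible kind of filtering.

```python
from collections import Counter
from typing import Callable, List

# Prompt wording taken from the comment; the template itself is hypothetical.
PROMPT = "What words or short terms are related to {headword}?"


def related_terms(
    headword: str,
    ask: Callable[[str], str],  # hypothetical: sends a prompt, returns the reply text
    runs: int = 3,              # assumed: how many times to query per headword
    min_votes: int = 2,         # assumed: keep terms seen in at least this many replies
) -> List[str]:
    """Query the model several times and keep terms that recur across replies."""
    votes: Counter = Counter()
    for _ in range(runs):
        reply = ask(PROMPT.format(headword=headword))
        # Assumes a comma-separated reply; normalise case and whitespace.
        terms = {t.strip().lower() for t in reply.split(",")}
        terms.discard("")                # drop empty fragments
        terms.discard(headword.lower())  # drop the headword itself
        votes.update(terms)              # one vote per term per reply
    return sorted(t for t, n in votes.items() if n >= min_votes)


def fake_ask(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    return "Tea, espresso, caffeine, coffee"


print(related_terms("coffee", fake_ask))
```

Scaling out would then mean running something like `related_terms` over every headword, with the real filtering presumably being far more involved than a vote threshold.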