Hacker News

NooneAtAll3, yesterday at 12:23 PM (1 reply)

I'd love to see unique-word stats as well


Replies

Imustaskforhelp, yesterday at 7:00 PM

Hey, website creator here! Can you elaborate? Do you mean counting only the unique words a user has written?

That would be interesting. It might be most helpful to people who want to see how varied someone's vocabulary is.

Do note that this comment is written by me (a human, hi! :D) but the following SQL query isn't.

  SELECT
      by AS username,
      sum(length(splitByWhitespace(text))) AS total_words,
      -- Extract words, clean punctuation, and count distinct values
      uniqExact(
          arrayJoin(
              arrayFilter(
                  x -> x != '',
                  arrayMap(x -> lower(replaceRegexpAll(x, '[^a-zA-Z]', '')), splitByWhitespace(text))
              )
          )
      ) AS unique_words,
      -- Calculate diversity: What percentage of their vocabulary is unique
      round((unique_words / total_words) * 100, 2) AS diversity_score
  FROM hackernews_history
  WHERE type = 'comment'
    AND deleted = 0
    AND lower(by) = lower('NooneAtAll3')
  GROUP BY by

  №  username     total_words  unique_words  diversity_score
  1  NooneAtAll3  942372       4752          0.5
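
For anyone who wants to sanity-check the metric offline, here is a minimal Python sketch of what the query above appears to be aiming for: total whitespace-separated tokens, distinct lowercase letter-only words, and unique words as a percentage of the total. The function name `word_stats` and the sample comments are made up for illustration and are not part of the site or its dataset. One caveat, as far as I understand ClickHouse: `arrayJoin` in the SELECT list expands rows before aggregation, so `total_words` in the query may be counted more than once per comment. The sketch counts each token exactly once, so its numbers would not necessarily match the result above.

```python
import re


def word_stats(comments):
    """Return (total_words, unique_words, diversity_score) for a list of comment texts.

    Hypothetical helper mirroring the intent of the SQL query:
    - total_words: number of whitespace-separated tokens
    - unique_words: distinct tokens after stripping non-letters and lowercasing
    - diversity_score: unique_words as a percentage of total_words, 2 decimals
    """
    total_words = 0
    unique = set()
    for text in comments:
        tokens = text.split()  # split on any run of whitespace
        total_words += len(tokens)
        for token in tokens:
            cleaned = re.sub(r"[^a-zA-Z]", "", token).lower()
            if cleaned:  # skip tokens that were only digits or punctuation
                unique.add(cleaned)
    if total_words == 0:
        return 0, 0, 0.0
    diversity = round(len(unique) / total_words * 100, 2)
    return total_words, len(unique), diversity


if __name__ == "__main__":
    sample = ["Hello, hello world!", "World of SQL 42"]
    print(word_stats(sample))
```

Because total words are counted before cleaning while unique words are counted after it, tokens such as "42" add to the total but never to the unique count, the same asymmetry the query has.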

Hope it helps :D Have a nice day! (Still written by a human. Alright, I am going to sleep right now. I had a lot of fun today with these posts and running random SQL queries :D)

Good night! This might be my last comment today before sleep. I will be busy tomorrow, so I might not be able to run any interesting ideas that people post here.