It’s funny how I spend so much time on HN, yet couldn’t point out a single username (that I don’t know IRL) besides dang.
This is one reason I feel an odd disconnect (anonymity?) with HN that isn’t felt on other social platforms I’ve been a part of. Those often have avatars or some other visual form of recognition that helps put a “face” to a name.
I’m not sure if that’s a good or bad thing, but I definitely think it’s intentional.
Nice SQLi vulnerability you got there ;-)
> making this project was the most fun I have had in some time haha!
> sorryyyyy for vibe coding it though. Peace. I am only human after all […]
Well, yes, of course the whole app was written by an LLM. I’m not surprised at all.
---
Request:
POST /?user=play&add_http_cors_header=1 HTTP/1.1
Host: play.clickhouse.com
Content-Type: text/plain;charset=UTF-8
User-Agent: Mozilla/5.0 (KHTML, like Gecko) Chrome/109.0.5414.120
Accept: */*
Origin: https://serjaimelannister.github.io
Referer: https://serjaimelannister.github.io/
SELECT username, total_words, global_rank, total_active_users,
concat(toString(global_rank), ' / ', toString(total_active_users)) AS placement,
round(100 * (1 - (global_rank / total_active_users)), 2) AS percentile
FROM (
SELECT by AS username, sum(length(splitByWhitespace(text))) AS total_words,
rank() OVER (ORDER BY sum(length(splitByWhitespace(text))) DESC) AS global_rank,
count(*) OVER () AS total_active_users
FROM hackernews_history WHERE type = 'comment' AND deleted = 0 AND notEmpty(by)
GROUP BY by
) WHERE username = '' OR 1=1;--' FORMAT JSON
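A minimal sketch of how the request above could avoid the injection, assuming the app pastes the username straight into the SQL text (which the `'' OR 1=1;--'` payload suggests). It leans on ClickHouse's HTTP query parameters (`param_<name>` in the URL, `{name:Type}` in the query); `build_request` and `QUERY` are illustrative names, not taken from the app, and the request is only built here, not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host taken from the request above.
ENDPOINT = "https://play.clickhouse.com/"

# The username is a typed placeholder, never part of the SQL text itself.
QUERY = """
SELECT username, total_words, global_rank, total_active_users
FROM (
    SELECT by AS username,
           sum(length(splitByWhitespace(text))) AS total_words,
           rank() OVER (ORDER BY sum(length(splitByWhitespace(text))) DESC) AS global_rank,
           count(*) OVER () AS total_active_users
    FROM hackernews_history
    WHERE type = 'comment' AND deleted = 0 AND notEmpty(by)
    GROUP BY by
)
WHERE username = {username:String}
FORMAT JSON
"""


def build_request(username: str) -> tuple[str, bytes]:
    """Return (url, body) for a POST; the username travels as a query parameter."""
    params = {
        "user": "play",
        "add_http_cors_header": "1",
        "param_username": username,  # bound server-side to {username:String}
    }
    return ENDPOINT + "?" + urlencode(params), QUERY.encode("utf-8")


if __name__ == "__main__":
    url, body = build_request("' OR 1=1;--")
    # The payload stays in the URL as data and never reaches the SQL text.
    print(parse_qs(urlsplit(url).query)["param_username"])
```

With this shape the payload is compared as a literal username, so it should simply match no rows.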
Response: This message is too large to display
Very cool. I made the top 1,000 too.
It would be interesting to see karma-per-word, as well, as a kind of succinctness density factor. Although karma points are not equivalent to quality, and you’d need to also factor in average comment length and some other things.
To use myself:
31,273 karma / 351,012 words ≈ 0.0891 karma per word
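The karma-per-word figure above is just a ratio, sketched here so others can plug in their own numbers. `karma_per_word` is a made-up helper name, and the inputs are the karma and word counts quoted in this thread.

```python
def karma_per_word(karma: int, words: int) -> float:
    """Karma earned per word written; a rough 'succinctness density'."""
    if words <= 0:
        raise ValueError("words must be positive")
    return karma / words


if __name__ == "__main__":
    # Figures quoted in the comment above.
    print(round(karma_per_word(31_273, 351_012), 4))
```

As the comment notes, this ignores average comment length and karma earned from submissions, so it is a toy metric at best.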
There's something about the numbers I can't figure out. Look at the top three HN contributors by karma[1]:
   username    words      karma
1. tptacek     4,310,896  416,351
2. jacquesm    3,841,209  237,961
3. ingve       2,273      215,283
How did ingve get to #3 with just 2 thousand words, whereas tptacek and jacquesm authored 3-4 million words? Looking at his 14-year history, it's true that he hasn't written that much. I suppose one possibility is that his writing is 1000x better at earning karma. But I'm going to hazard a guess that it's the quality of his 3-4 submissions per day that brings up his karma when one of his submissions is a hit (I think that submissions do count toward karma).
I miss DoreenMichele. She always added thoughtful perspectives.
Looks like she’s actively writing at https://califmichele.blogspot.com/ and https://doreenmichele.blogspot.com/ but has departed HN.
I'm also naturally curious about the byte count --- using the accepted standard of 5 for words to characters, and since I almost never post anything but ASCII, I've been writing approximately 1.25KB per day here; or just over 5.5MB worth of text so far. Considering that English text compresses very well, and using ~20% as a rough ratio, this means that all ~1.2M words of my comments here, compressed, would still fit on one 3.5" floppy disk.
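A back-of-the-envelope sketch of the floppy estimate above, using the same assumptions the comment states: 5 characters per word, one byte per character (ASCII), and a ~20% compression ratio. The function names are made up, and the 1,474,560-byte capacity is the usual formatted size of a "1.44 MB" 3.5" disk.

```python
FLOPPY_BYTES = 1_474_560  # 2880 sectors x 512 bytes on a "1.44 MB" disk


def estimate_raw_bytes(words: int, chars_per_word: int = 5) -> int:
    """Raw size assuming one byte per character (ASCII text)."""
    return words * chars_per_word


def estimate_compressed_bytes(raw_bytes: int, ratio: float = 0.20) -> int:
    """Compressed size under an assumed, rough compression ratio."""
    return round(raw_bytes * ratio)


def fits_on_floppy(words: int) -> bool:
    return estimate_compressed_bytes(estimate_raw_bytes(words)) <= FLOPPY_BYTES


if __name__ == "__main__":
    print(fits_on_floppy(1_200_000))
```

Real ratios depend on the compressor and the text, so the 20% figure is only as good as the comment's own guess.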
Great app, pleased to be in the top 0.38%, but it appears that does not translate to a top 100 spot by an order of magnitude.
It would be nice to have some readership stats, too.
I've been wondering whether Webcam-based eyetracking software could be used to calculate via triangulation/trilateration which word one is reading on the screen.
Then words could be color-coded by impact.
The four most prolific writers are:
1 15.95 dragonwriter
2 14.37 tptacek
3 12.80 jacquesm
4 11.15 dang
Take this (and OP) with a grain of salt, if for no other reason than that it does not account for how long someone has been commenting here.
Cool project!
Slight nitpick (in the spirit of HN): Looks like the search is case sensitive when I think HN usernames are not. Only realised when my phone capitalised the first letter and it returned no results, but worked after searching in lowercase.
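One possible way to soften the case issue, sketched under the assumption that the app could compare a case-folded form of what was typed against known usernames. Whether HN itself treats names that differ only by case as the same account is not settled in this thread, so the helper returns every match rather than picking one. `find_usernames` is a hypothetical helper, not part of the app.

```python
from typing import Iterable, List


def find_usernames(query: str, known_usernames: Iterable[str]) -> List[str]:
    """Return all usernames equal to the query when letter case is ignored."""
    needle = query.strip().casefold()
    return [name for name in known_usernames if name.casefold() == needle]


if __name__ == "__main__":
    # A phone keyboard capitalising the first letter still finds the account.
    print(find_usernames("Dang", ["dang", "tptacek", "jacquesm"]))
```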
Goodness me. I'm in the top 2.2% by word count.
I'm not at all sure what I feel about this. On the one hand it's fun to be near the top of some kind of ranking, on the other it suggests that I spend altogether too much time on HN.
"No, I don't think I will" - I already have a sense of how much time I've spent here.
I recognize most of the top 50 usernames but I have no idea who dragonwriter is.
Maybe we just have opposite interests but that was a surprise.
How does it count so fast? ClickHouse preloaded dataset?
Top 0.023%, I was surprised! I usually keep it pretty short here, and my account isn't old.
Very cool. I would point out that the search is case-sensitive, and with that being said I'm not sure if HN usernames are case-sensitive.
Ooh I cracked the top 500. I’m at about 475k words.
Took me a few tries to find my user since I wasn’t expecting the case sensitivity.
Thanks for this. Another book you could add for comparison purposes would be James Joyce’s Ulysses. Or I guess the unabridged The Stand by Stephen King would be good too.
Ooh The Stand (unabridged) is estimated at 473,000 words! I wrote The Stand in comment length. Wow.
Cool! Just a thought: instead of having to query the Clickhouse cluster whenever a client clicks "View Top 1000 Leaderboard" (which could cause a lot of load), it might be useful to instead fetch the top 1000 every hour (day?) and display the top 1000 as a static list.
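A minimal sketch of the idea above, written as a time-based cache rather than a scheduled job: the leaderboard is fetched at most once per hour and every other click is served from memory. `TTLCache` and the `fetch` callable are hypothetical; the real app would pass in whatever function runs its top-1000 query.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Caches the result of `fetch` and refreshes it once it is older than the TTL."""

    def __init__(
        self,
        fetch: Callable[[], T],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._value: Optional[T] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()  # the only place the backend is queried
            self._fetched_at = now
        return self._value  # type: ignore[return-value]
```

On a static site with no server, the same effect would need a build step or scheduled job that writes the list to a file, which is closer to what the comment describes.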
Top 0.11% / #814 by word count? Did not expect that. I wonder if it’s possible to see the trend by year. I hope that’s more from 2022 and earlier.
This is pretty cool! This week I was just thinking of vibe coding something with my HN profile as well (e.g, analyze how my writing has changed over the decade-ish of being on here).
Also, 95k words written on here apparently. Cool to know haha.
Rank 774, not bad for a 2021 account I guess. Or not good? Depends on whether typing 1.1 Game of Thrones volumes into a message board is good or not.
> 357,191 words
That's like, five novels. What I have been doing here, damn...
So if we find somebody who uses one-word posts like "interesting" on every comment, have we unmasked .. he who mus(k)t not be named?
@dragonwriter was meant for this. Has written ~16 volumes of GoT, about 1.5 volumes more than second place.
Aw man, I'm only 112 places away from breaking into the top 1000. Time to go pick some fights...
(I kid, I kid, dang don't hurt me)
Hey Hackernews, you can read my previous comment https://news.ycombinator.com/item?id=46827731#46828331 where I was writing away until I suddenly realized that I have written way too many words on Hackernews.
I then got the idea of actually figuring out how many. At first I wanted to try out Algolia, but later I found out about ClickHouse and its playground, and the API for it is so simple. I am definitely gonna make more projects on top of the ClickHouse playground for HN (seriously, my mind got blown, because I was assuming that the browser -> API part was gonna be hard, but it seriously wasn't).
Then I decided to put it up as a GitHub page for other people as well.
Anyways, this was one of the most fun projects I've had. So it turns out that I personally have written 0.64 Game of Thrones volumes' worth of words on Hackernews itself.
Dang has written the equivalent of 11.15 Game of Thrones volumes, which is actually really crazy.
When I searched dang I was shocked haha. Anyways Dang, if you are reading this, I know that we all like to talk about how moderation of HN has issues, but seriously man, the amount of effort you put in is really lovely & respectable. We all love you.
I still feel like there are some issues, like people flagging anything they dislike, which can be frustrating. But that doesn't really take away from the moderation: the moderation team (dang) is pretty awesome in my opinion, even if the website does have this flaw, and Hackernews is one of the best websites, man!
Dang, today's your day! We can discuss flagging and the other issues some other day. Have a nice day now!
(Also a little side fact: I picked Game of Thrones because my GitHub name is SerJaimeLannister. I was once in my brother's college dorm room and watched just one or two episodes, starting from s4 or something, and literally the second I got home I binge-watched Game of Thrones to the end and then went back to s1 and s2. I think there are still some seasons I haven't fully watched, s3 iirc, but I loved the show so much. I had also lost my old GitHub account, and naming is always hard, especially in programming, so I picked SerJaimeLannister. That's the reason why I picked Game of Thrones as the novel equivalent!)
> Top 0.41%
If only any of that was useful!
On a side note though there is (maybe intentional) case sensitivity? Can't remember how hn usernames work.
I'm in the top 1.5%, even though I hardly have written anything here, and the comments are full of similar anecdotes. I guess there's a _ton_ of people lurking, and the active community is actually quite small. I find that quite surprising.
Looking at the top 1000 I'm surprised there's no power law. It's just a lot of people with generally similar number of words.
So many of these names I feel I know them, but I don't know them, personally.
I know them, by tone. I read his/her take on the topic. Turns out you don't need to see any faces or body ratios of any kind to connect with people.
Thanks for keeping HN 'stable/sane'!
I did really simple frequency analysis based on corpus source a while ago and the results were super clear: you can tell a corpus by its frequency fingerprint. I wonder if something similar to this could fingerprint bot accounts?
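A rough sketch of what a "frequency fingerprint" comparison could look like. The comment does not say which method was used, so this assumes plain word counts compared by cosine similarity; `fingerprint` and `cosine_similarity` are illustrative names.

```python
import math
import re
from collections import Counter


def fingerprint(text: str) -> Counter:
    """Word-frequency profile of a text (lowercased, letters and apostrophes only)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """1.0 for identical frequency profiles, 0.0 when no words are shared."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    human = fingerprint("I think the search is case sensitive, which surprised me")
    other = fingerprint("The search box is case sensitive and that surprised me too")
    print(cosine_similarity(human, other))
```

Whether this separates bot accounts from people is an open question; it would need far more text per account than a couple of sentences.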
Global rank of 1832, word count of 197,292, top 0.24% percentile. Karma/word comes out at ~0.0372...
Ouch. Feels like I need to spend more time elsewhere.
Huh. In the top 1500, with approximately one GoT worth of text in ~17 years.
Also, I recognize four of the top five users as prolific commenters, but dragonwriter doesn’t ring a bell at all. Maybe they frequent all the threads that I don’t.
i've written a novel's worth of words. yikes! ps, game of thrones is not a good comparison. a literary agent is more likely to take your novel seriously if you have more than 50k words
for an account i created in june 2024, top 0.54% is a lot. I need to spend less time on HN. more than that, I need to stop typing walls of text, has to be annoying to readers! :)
Pretty amazed to be on the list at all.
Surprised that with ~6k words I am already in the top ~5%. I guess the old 90-9-1 rule roughly holds up.
Global Rank 7089 | Word Count 62,677 | Percentile Top 0.92% | Game of Thrones Volume 0.21
This would be pretty cool for other sites. My Reddit stats are probably way worse.
Dear reader, does this person now have all your alt accounts linked to you?
Oh my.
> Global Rank > 385 / 774235
> Word Count > 509,412
> Top 0.05%
I don't know if I'm too long-winded or I comment too much or both. Good to know I'm in the top 400 regardless.
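For anyone checking their own "Top X%" figure, a small sketch of the arithmetic, assuming it is simply rank divided by the total number of active users. The query quoted earlier in the thread computes the complementary percentile, `100 * (1 - rank / total)`. The function names are made up.

```python
def top_percent(rank: int, total_users: int) -> float:
    """Share of users at or above this rank, as a percentage."""
    if not 1 <= rank <= total_users:
        raise ValueError("rank must be between 1 and total_users")
    return 100 * rank / total_users


def percentile(rank: int, total_users: int) -> float:
    """Complement of top_percent, matching the formula in the quoted query."""
    return 100 * (1 - rank / total_users)


if __name__ == "__main__":
    # Rank and user count quoted in the comment above.
    print(round(top_percent(385, 774_235), 2))  # 0.05
    print(round(percentile(385, 774_235), 2))   # 99.95
```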
I feel like a perfect realization of Goodhart's Law is about to happen as we all try to move up our rankings.
Top 1.55% with ~36000 words. I can be quite chatty in my comments, it seems.
Heh. Here's a thread where the most verbose commenters come and write even more. I haven't written nearly as much as I thought: 2,410th out of 774,235 users, 159,634 words, Top 0.31%.
A few years ago, I exported my HN and reddit comments along with my personal blog and private notes into a SQLite database. It was millions of words. I had a vague plan of pulling out long, insightful bits and editing them together into a book of essays. I also thought it would be cool to be able to look up my previous thoughts on a topic. Neither ended up happening.
I've been meaning to do the same thing to train an LLM, but I'm not sure I particularly need a digital version of me. Though it would be interesting to ask it to write a book for me in my own style.
In theory, it'd be the best book I have ever read.
It would be fascinating to see a word to karma ratio. (Mine would be incredibly low).
I like the game of thrones conversion.
Look on my [prolix] words, ye Mighty, and despair!
Click [here] to train a 6B model with just your words ...
Legend has it only a dragon writer could defeat tptacek on Hacker News.
Also I find it kind of weird that the search box is case-sensitive. HN itself preserves capitalization when rendering usernames to the page, but must not be case sensitive in the backend since the username shows up in URLs.