jbellis, yesterday at 10:51 PM (3 replies)

I feel bad that I wasted my time reading this.

On the points in the article:

1. Yes, "gain" is a vanity metric but it's harmless, nobody is being "fooled" here.

2. This could be a problem in principle, sure, but unless you're actually vetting bug reports you're just spreading FUD.

3. Again, do you have any reason to believe that the thousands of devs using rtk are silently tanking their performance without noticing? Here's a thought: instead of reporting that SOMEONE SHOULD MEASURE THIS, you could, you know, measure it yourself.

4. Good lord, what is this doing in a purportedly technical article?

5. Yes, this is inherent in the problem domain, again, nobody is being "fooled".

Yes, I'm grumpy; reading this article was a waste of time.

Bias: I had my first RTK PR accepted today, so I guess I probably know more about it than this guy who got offended by "gain" and spat out the first thoughts that came to mind.


Replies

lackoftactics, yesterday at 11:27 PM

1. Are you sure no one is fooled? It’s the main thing managers are praising rtk for and using as an argument for its validity. If this is gamed, it paints a very different picture.

2. No, I didn’t vet all the reports. But they paint quite a convincing picture of the problems present in the library, which has the very ambitious goal of handling every popular command and making it less verbose.

3. You know this is not a valid point. Engineers tanking performance and choosing tools based on hype is nothing new. GitHub stars and usage numbers are not a valid argument when the tool is not very transparent and could quietly fail. If it’s only a couple of percent less accuracy, most wouldn’t easily recognize it amid the whole stack of skills, MCPs and agents.md.

4. Is it something more than a feature? If the benefit is $3 on $900, as another commenter pointed out (citing an article from codepointer that is maybe better and better researched than mine), why would I risk all the possible bugs and worse accuracy for that?

5. Hard to address this one. It’s a tough problem domain, with endless CLI commands to capture and process properly.

Congratulations on your accepted PR. I didn’t want to make you grumpy today. If you feel I am wrong, it’s very possible. I am just a guy who wrote down his point of view; that doesn’t automatically make it valid. Once again, sorry for making you grumpy.

beepbooptheory, yesterday at 10:56 PM

How is 1 not more damning? It sounds like the fundamental service they are purportedly providing is not real. Am I reading it wrong?

0123456789ABCDE, today at 12:23 PM

hope you feel better knowing your effort, reading and then commenting, is appreciated here, and convinced me to read OP's article. it's short, and raises valid points, but i'm left wondering why your reply is so defensive

let me try that style

  1. it's not *just vanity* if it feeds into *rtk*'s pitch. it's the hook, it's meant to convince users that *rtk* will reduce token waste.
  2. OP's article is not spreading fear, uncertainty, or doubt. at best it disputes *rtk*'s claims that it is effective in reducing token waste, and it does so directly with the question: "Where Are the Accuracy Benchmarks?"
  3. a) *beep* - you are disqualified for failing to identify that the *burden of proof* lies with *rtk*, not OP; b) OP made no claims, except for the ones you conveniently dismiss: the github issues. furthermore, the question of "reason[s] to believe that the thousands of devs using rtk are silently tanking their performance without noticing" was already answered. you missed it because you couldn't see past the joy of having your pull-request recently merged.
  4. really, you were so disturbed by the article, you couldn't even ignore the *one* non-technical point, in an article *you chose* to interpret as being technical; all of it being your own fault. never mind how relevant it is as a signal for the effectiveness of such techniques.
  5. is it inherent? are we doomed to live with broken tool outputs? note, the issue here is not that *rtk* will fail when output changes (*that* is inherent to *rtk*'s current implementation, as i understand it), but that "it will fail quietly, feeding corrupted or partial text to your agent".

you are not better informed than gp because you have commits to your name in rtk. you're just biased by the proximity. we're all at a loss for how effective rtk is, because there are no benchmarks measuring its performance beyond some "vanity metric[s]".

you were so close to getting it here:

> instead of reporting that SOMEONE SHOULD MEASURE THIS, you could, you know, measure it yourself

but hey, thanks for getting me to take another look at rtk & co. i am now further convinced these are just flavor-of-the-month tricks for speedrunning context rot