Hacker News

skysniper, yesterday at 4:45 PM

Sorry, didn't know that. Here is my handwritten tl;dr:

Gemini is very unreliable at using skills; it often just reads the skills and decides to do nothing.

StepFun leads the cost-effectiveness leaderboard.

Ranking really depends on the task, so it's better to try your own task.


Replies

refulgentis, yesterday at 4:56 PM

It’s too late once it’s happened. I was curious, but when I saw the site looked vibecoded and you’re commenting with AI, I decided to stop trying to reason through the discrepancies between what was claimed and what’s on the site (e.g., 300 battles vs. only a handful in the site data).
