The intensity of competition between models is so intense right now they are definitely benchmaxxing...

rustyhancock • yesterday at 7:58 PM • 5 replies

The intensity of competition between models is so intense right now they are definitely benchmaxxing pelican on bike SVGs and Will Smith spaghetti dinner videos.

Replies

bonesss • yesterday at 8:48 PM

Parallel hypothesis: the intensity of competition between models is so intense that any high-engagement high-relevance web discussion about any LLM/AI generation is gonna hit the self-guided self-reinforced model training and result in de facto benchmaxxing.

Which is only to say: if we HN-front-page it, they will come (generate).

stared • yesterday at 8:16 PM

There was Lenna for digital image compression (https://en.wikipedia.org/wiki/Lenna).

A pelican on a bike is SFW, inclusive, yet cool.

It is not a full benchmark - rather a litmus test.

bayindirh • yesterday at 8:05 PM

So, again, when the indicator becomes a target, it stops being a good indicator.

thatguysaguy • yesterday at 8:36 PM

You can just try other SVGs; I got some pretty good ones.

(*Disclaimer: I work for Google, but I have zero idea what they trained Deep Think on.)

yieldcrv • yesterday at 8:13 PM

Note that, this benchmark aside, they've gotten really good at SVGs. I used to rely on the Noun Project for icons, and sometimes various libraries, but now coding agents just synthesize an SVG tag in the code and draw all the icons.
