The pelican is a lot | alt Hacker News

simonw • yesterday at 7:29 PM • 24 replies

The pelican is a lot: https://github.com/simonw/llm-gemini/issues/133#issuecomment...

Not a great bicycle though, it forgot the bar between the pedals and the back wheel and weirdly tangled the other bars.

Expensive too - that pelican cost 13 cents: https://www.llm-prices.com/#it=11&ot=14403&sel=gemini-3.5-fl...
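For anyone who wants to check the arithmetic, here is a rough sketch of how a per-request cost like that falls out of the token counts in the link (`it=11`, `ot=14403`). The two per-million-token prices are placeholder assumptions picked only so the total lands near the quoted 13 cents; they are not the model's published prices, and the function name is made up for illustration.

```python
def request_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost of one request in US dollars, given prices per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Token counts taken from the llm-prices.com URL parameters (it=11, ot=14403).
INPUT_TOKENS = 11
OUTPUT_TOKENS = 14403

# ASSUMPTION: illustrative prices in USD per million tokens, not real pricing.
ASSUMED_INPUT_PRICE_PER_M = 1.50
ASSUMED_OUTPUT_PRICE_PER_M = 9.00

cost = request_cost_usd(
    INPUT_TOKENS,
    OUTPUT_TOKENS,
    ASSUMED_INPUT_PRICE_PER_M,
    ASSUMED_OUTPUT_PRICE_PER_M,
)
print(f"{cost * 100:.1f} cents")
```

With only 11 input tokens, almost all of the cost comes from the output side, whatever the real prices are.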

Replies

hedgehog • yesterday at 7:32 PM

That pelican looks like it's in Miami for a crypto conference.

irthomasthomas • yesterday at 7:48 PM

This is a perfect illustration of something I noticed with LLM progress. Ask them to improve an SVG like this, and it never fixes the missing crossbar or disconnected limbs, it just adds more stuff. In this example they have obviously improved greatly, and it contains a ridiculous amount of detail, but they still fail to get the basic shape of the frame right. It's weird. And the pattern shows up everywhere: try it with a webpage and it will add more buttons and stuff. I've even experimented with feeding the broken pelican SVGs to an image model to look for flaws, and they still fail to spot the broken elements.

edit: fixed human hallucination

tantalor • yesterday at 8:01 PM

Forgetting the chainstay is typical of asking random people to draw a bicycle.

https://www.gianlucagimini.it/portfolio-item/velocipedia/

> most ended up drawing something that was pretty far off from a regular men’s bicycle

VectorLock • today at 4:40 AM

The fact it went for vaporwave styling on its own is very telling.

smcleod • yesterday at 7:55 PM

I feel like it embodies Google's vibe of an uncool guy trying to stay relevant to the youth pretty well.

tandr • today at 12:18 AM

If you sort that table by "output token price", it gets really terrifying - going from 4 cents up to $600 =8-O

nrds • today at 12:27 AM

We've been daily-driving this model for a few weeks and let me tell you, everything it does is a lot. Fast as fuck and it's actually not bad intelligence-wise for a fast model. It basically tries to make up for any intelligence deficit by just doing a lot, checking a lot, retrying a lot.

That's not to say I don't spend my days raging at it... a lot... but it's not that bad. It does tend to ignore completion criteria but it doesn't obviously degrade when being nudged like some models do.

dekhn • today at 12:55 AM

I'm told there is a new Jeff Dean fact inside google: "Jeff Dean manually adjusts the weights in the model just to screw with Simon".

karmakaze • today at 1:01 AM

I'm hoping we'll have many of these pelican cyclist pictures collected. Then when all the models can do it well, we'll stop posting about them, and when the next generations of AIs train on the data we'll have these canonical archetypes.

bee_rider • today at 2:22 AM

I wonder if they added all these unrequested details as an Easter-egg or something? (Since they must be aware of your test by now).

hydra-f • yesterday at 7:38 PM

Same old issue with Gemini models trying to "enrich" everything

taurath • today at 12:40 AM

I can’t help but think that what AI is best at is convincing management that the things it creates are full-featured, which reads to their brains as mature.

nickvec • yesterday at 9:31 PM

I enjoy the vaporwave aesthetic it went for. Looks like the pelican has a fish in its mouth too?

https://en.wikipedia.org/wiki/Vaporwave

khy • yesterday at 8:56 PM

That sun is very similar to the one from the background of this other top HN post about the OS museum: https://news.ycombinator.com/item?id=48195009

sbinnee • yesterday at 10:16 PM

Wow, what’s with all the styling? Is it a manifestation of Google’s styling bias? I like the result for sure. It’s shiny and pretty. But then it’s something I didn’t ask for.

danilocesar • today at 12:18 AM

Given your pelican is very famous now, don't you think they are adding instructions to beat this benchmark these days?

Razengan • today at 1:56 AM

I've found prompts like "capybara with spotted fur and 7 octopus tentacles instead of legs, each a different color, riding a tricycle" etc. to be a better test

Last time I tried, ChatGPT's image generator got the best result.

setgree • yesterday at 9:05 PM

`wtf`

WTF??

__mharrison__ • yesterday at 10:03 PM

They are just trolling you now

gcgbarbosa • yesterday at 7:53 PM

funny that when I try the same prompt, gemini generates an image, not an SVG. something is not right.

nashashmi • yesterday at 7:44 PM

Beats a human by like $10

TacticalCoder • yesterday at 10:30 PM

Love your pelicans, as always. And that one is... Wow.

I noticed the "Synthwave" aesthetic, which has been enjoying quite some success for quite some time now, has found its way into AI models (even when it's not in the user's query). It's not the first time I've seen the sun at sunset with color bands etc. in AI-generated pictures. Don't know why it's now catching on in AI too.

https://en.wikipedia.org/wiki/Synthwave

Hence the comments here about the 90s, Sonny Crockett's white Ferrari Testarossa in Miami, etc.

To be honest as a kid from the 80s and a teenager from the 90s who grew up with that aesthetic in posters, on VHS tape covers, magazine covers, etc. I do love that style and I love that it made a comeback and that that comeback somehow stayed.

holtkam2 • yesterday at 7:55 PM

at a certain point you're gonna need to change your benchmark because this will end up in the model's training set
