
nickandbro today at 4:43 PM

Does well on SVGs outside of the "pelican riding on a bicycle" test. Like this prompt:

"create a svg of a unicorn playing xbox"

https://www.svgviewer.dev/s/NeKACuHj

The final result still needs some tweaks, but I am guessing that, with the ARC-AGI benchmark score jumping so much, the model's visual abilities are what allow it to do this well.


Replies

ertgbnm today at 9:30 PM

Animated SVGs are one of the examples in the press release. Which is fine, I just think the weird SVG benchmark is now dead. Gemini has beaten the benchmark, and now the differences are just coming down to taste.

I don't know if it got these abilities through generalization or if Google gave it a dedicated animated SVG RL suite that got it to improve so much between models.

Regardless, we need a new vibe check benchmark à la bicycle pelican.

simonw today at 4:44 PM

Interesting how it went a bit more 3D with the style of that one compared to the pelican I got.

pugio today at 10:21 PM

Unfortunately, it still fails my personal SVG benchmark (an educational 2D cross section of the human heart), even after multiple iterations and screenshot feedback. Oh well, back to the (human) drawing board.

andy12_ today at 4:48 PM

I'm thinking now that, as models get better and better at generating SVGs, there could be a point where we can use them to make arbitrary UIs and interactive media with raw SVGs in real time (like Flash games).

roryirvine today at 5:41 PM

On the other hand, creation of other vector image formats (e.g. "create a postscript file showing a walrus brushing its teeth") hasn't improved nearly as much.

Perhaps they're deliberately optimising for SVG generation.

mclau153 today at 7:33 PM

Can we move on from SVG to 3D models at some point?
