Hacker News

cedws, yesterday at 12:41 PM

>Multimodal by design: Gemma 3n natively supports image, audio, video, and text inputs and text outputs.

But I understand your point: Simon asked it to output SVG (text) instead of a raster image, so it's more difficult.


Replies

simonw, yesterday at 1:12 PM

It can handle image and audio inputs, but it cannot produce those as outputs - it's purely a text output model.
