logoalt Hacker News

gamma-interfaceyesterday at 3:39 PM3 repliesview on HN

The pace of commoditization in image generation is wild. Every 3-4 months the SOTA shifts, and last quarter's breakthrough becomes a commodity API.

What's interesting is that the bottleneck is no longer the model — it's the person directing it. Knowing what to ask for and recognizing when the output is good enough matters more than which model you use. Same pattern we're seeing in code generation.


Replies

sincerelyyesterday at 9:25 PM

PLEASE STOP POSTING AI GENERATED COMMENTS

SV_BubbleTimeyesterday at 5:17 PM

SOTA shifts, yes. But the average person doing the work has been very happy with SDXL based models. And that was released two years ago.

The fight right now outside of API SOTA is who will replace SDXL to be the “community preference”

It’s now a three way between Flux2 Klein, Z-Image, and now Qwen2.

echelonyesterday at 8:09 PM

I'm happy the models are becoming commodity, but we still have a long way to go.

I want the ability to lean into any image and tweak it like clay.

I've been building open source software to orchestrate the frontier editing models (skip to halfway down), but it would be nice if the models were built around the software manipulation workflows:

https://getartcraft.com/news/world-models-for-film