>In your example, z-image and Nano Banana Pro look basically equally photorealistic to me
I disagree, the Nano Banana Pro result is in a completely different league compared to Flux.2 and z-image.
>But anyway, realistic environments like a street cafe are not suited to test for photorealism
Why? It's the perfect setting in my opinion.
Btw, I don't think you are using Nano Banana Pro, probably standard Nano Banana. This is what I get from your prompt: https://i.ibb.co/wZHx0jS9/unnamed-1.jpg
>Nano Banana Pro is a Gemini scaffold which uses Imagen to generate an initial image, then Gemini 3 Pro writes prompts to edit the image for much better prompt alignment.
First of all, how would you know the architecture details of gemini-3-pro-image? Second, how can the model modify the image if Gemini itself is just rewriting the prompt (like the old ChatGPT + DALL-E setup)? Imagen 4 is just a text-to-image model, not an editing one, so that doesn't make sense. Nano Banana Pro can edit images (including ones you provide).
> I disagree, the Nano Banana Pro result is in a completely different league.
I strongly disagree. But even if you are right, the difference between the cafe shots and the Atlantis shots is clearly much, much larger than the difference between the different cafe shots. The Atlantis shots are super unrealistic. They look far worse than the cafe shots of Flux.2 Pro.
> Why? It's the perfect setting in my opinion.
Because it's too easy, obviously. We don't need an AI to make fake realistic photos of realistic environments when we can easily photograph those ourselves. Unrealistic environments are more discriminative because they are much more likely to produce garbage that doesn't look photorealistic.
> Btw, I don't think you are using Nano Banana Pro. This is what I get from your prompt: https://i.ibb.co/wZHx0jS9/unnamed-1.jpg
I'm definitely using Nano Banana Pro, and your picture has the same strong AI look to it that is typical of NBP / Imagen 4.
> First of all, how would you know the architecture details of gemini-3-pro-image? Second, how can the model modify the image if Gemini itself is just rewriting the prompt (like the old ChatGPT + DALL-E setup)? Imagen 4 is just a text-to-image model, not an editing one, so that doesn't make sense. Nano Banana Pro can edit images (including ones you provide).
There were discussions about it previously on HN. Clearly NBP is using Gemini reasoning, and clearly the style of NBP strongly resembles Imagen 4 specifically. There is probably also a special editing model involved, just like in Qwen-Image-2.0.