Hacker News

smusamashah · today at 3:59 AM · 3 replies

This is just img2img, where the first image (the one with the correct structure) was generated by code.
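A minimal sketch of that first step: build the structure image programmatically as an SVG, which would then be rasterized and passed as the init image to an img2img pipeline. The img2img call itself is elided here; the coordinates, colors, and the `layout_svg` helper are all illustrative, not part of any real API.

```python
def layout_svg(width, height, boxes):
    """Render plain rectangles (x, y, w, h, fill) into an SVG string.

    The exact shapes don't matter for img2img guidance -- only that
    the model sees roughly where each element belongs.
    """
    rects = "".join(
        f'<rect x="{x}" y="{y}" width="{w}" height="{h}" fill="{fill}"/>'
        for (x, y, w, h, fill) in boxes
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">{rects}</svg>'
    )

# A crude room layout: floor, couch, window (hypothetical scene).
svg = layout_svg(512, 512, [
    (0, 384, 512, 128, "#8a6d3b"),   # floor strip along the bottom
    (64, 288, 224, 96, "#444444"),   # couch, left of center
    (352, 64, 96, 128, "#a8d0ff"),   # window on the far wall
])
# Rasterize `svg` and hand it to your img2img pipeline of choice
# with a moderate denoising strength, so the layout survives.
```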


Replies

vunderba · today at 5:41 AM

Yup, that’s exactly what this is. If you’ve been using generative models since the early Stable Diffusion days, it’s a pretty common (and useful!) technique: using a sketch (SVG, hand-drawn, etc.) as an ad-hoc "ControlNet" to guide the generative model’s output.

Example: In the past I'd use a similar approach to lay out architectural visualizations. If you wanted a couch, chair, or other furniture in a very specific location, you could use a tool like Poser to build a simple scene approximating where the major "set pieces" should go. From there, you could generate a depth map and feed that into the generative model (SDXL, at the time) to guide where objects were placed.
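The depth-map step above can be sketched in a few lines: rasterize a handful of "set pieces" at known depths into a grayscale image (nearer = brighter is a common convention), which would then condition a depth ControlNet. Everything here is a toy illustration under those assumptions: the `depth_map` helper, the scene coordinates, and the elided SDXL call are not a real API.

```python
def depth_map(width, height, pieces):
    """pieces: (x, y, w, h, depth) with depth in [0, 1], 1 = nearest camera.

    Returns a row-major grid of 0-255 grayscale values, painted
    far-to-near so nearer objects occlude farther ones.
    """
    img = [[0] * width for _ in range(height)]  # 0 = infinitely far
    for x, y, w, h, depth in sorted(pieces, key=lambda p: p[4]):
        val = int(depth * 255)
        for row in range(y, min(y + h, height)):
            for col in range(x, min(x + w, width)):
                img[row][col] = val
    return img

# Hypothetical scene: a couch close to camera, a window on the far wall.
dm = depth_map(64, 64, [
    (8, 36, 28, 12, 0.9),   # couch, near
    (40, 8, 12, 16, 0.3),   # window, far
])
# Encode `dm` as a PNG and pass it as the conditioning image to a
# depth ControlNet alongside your text prompt.
```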

jasonjmcghee · today at 4:16 AM

Pretty much what the author said; I just gave some context for the uninitiated.

philsnow · today at 4:46 AM

Right, but you can use a different (codegen) model to make that code.