For image generation, or even video generation, local models are totally feasible. I can generate a 5-second clip with Wan 2.2 in about 30 minutes on my 3060 12GB. Plus, I have full control over which LoRAs are used.