Honestly, assuming you have no ethical qualms, it sounds like you could get by with a Mac or an AMD 395+ running the newest models, specifically QWEN3.5-Coder-Next. It does exactly what you describe. It maxes out around 85k tokens of context, which, if you do a good job providing guard rails, is roughly the size of a small-to-medium project.
It does seem like the sweet spot between WALL-E and the destroyed Earth in WALL-E.
Seems like the AMD 395+ only manages about 16 tokens/s, which is 25-33% of the speed of SOTA models. Break-even on a $3000 machine is ~15 months.
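For anyone who wants to sanity-check those numbers, here is a rough sketch of the arithmetic. It assumes break-even is measured against a flat monthly subscription and ignores electricity and resale value; the function and variable names are made up for illustration. Only the $3000, ~15 months, 16 tokens/s and 25-33% figures come from the comment above.

```python
# Back-of-the-envelope check of the figures quoted in the comment above.
MACHINE_COST_USD = 3000    # quoted machine price
BREAK_EVEN_MONTHS = 15     # quoted break-even time
LOCAL_TOKENS_PER_S = 16    # quoted local generation speed


def break_even_months(machine_cost: float, monthly_fee: float) -> float:
    """Months until the hardware cost equals the subscription fees avoided.

    Assumption: a flat monthly fee, with no electricity or resale value.
    """
    return machine_cost / monthly_fee


# Monthly fee implied by the quoted break-even time.
implied_monthly_usd = MACHINE_COST_USD / BREAK_EVEN_MONTHS

# If 16 tokens/s is 25-33% of hosted speed, hosted speed lies in this range.
hosted_low = LOCAL_TOKENS_PER_S / 0.33
hosted_high = LOCAL_TOKENS_PER_S / 0.25

print(f"Implied subscription: ${implied_monthly_usd:.0f}/month")
print(f"Implied hosted speed: {hosted_low:.0f}-{hosted_high:.0f} tokens/s")
print(f"Break-even at that fee: {break_even_months(MACHINE_COST_USD, implied_monthly_usd):.0f} months")
```

So the ~15 month figure only holds if you would otherwise be paying about $200/month; at a cheaper plan the break-even stretches out proportionally.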
Sorry, out of the loop. Which ethical qualms are you referring to?