More generally: LLM effectiveness is inversely proportional to domain specificity. They are very good at producing the average, but completely stumble at the tails. Highly particular brownfield optimization falls into the tails.