the context-specificity problem you're describing is exactly why the draft/execute divide is so persistent across AI use cases.
it's not a model capability problem. it's an architecture problem: the relevant context is distributed across sources (the priest's knowledge of their parish, its history, their relationships) that nobody has wired into the workflow. a homily generator without that context produces generic output. a priest who knows their community produces something unreplicable.
same pattern shows up in ops work. every ops request looks like a generic task -- 'update contract status,' 'respond to renewal question' -- but the context required to do it well is scattered across the CRM, email threads, Slack history, billing records. automate the task without the context and you get confident, generic output that's often wrong. the hard problem isn't drafting, it's knowing which context matters for this specific request before you act on it.
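a minimal sketch of that "assemble context before drafting" step -- all source names and the toy keyword-overlap scoring here are illustrative assumptions, not a real API; a production system would use proper retrieval (embeddings, metadata filters) instead:

```python
# toy sketch: pull candidate context from several systems, rank snippets by
# relevance to the specific request, and only then hand them to a drafter.
# source names and scoring are hypothetical, for illustration only.

def keyword_overlap(request: str, snippet: str) -> float:
    """crude relevance: fraction of request words that appear in the snippet."""
    req = set(request.lower().split())
    snip = set(snippet.lower().split())
    return len(req & snip) / len(req) if req else 0.0

def assemble_context(request: str, sources: dict[str, list[str]], top_k: int = 3) -> list[str]:
    """flatten snippets from all sources, keep the top_k most relevant ones."""
    scored = [
        (keyword_overlap(request, snippet), name, snippet)
        for name, snippets in sources.items()
        for snippet in snippets
    ]
    scored.sort(reverse=True)  # highest relevance first
    return [f"[{name}] {snippet}" for score, name, snippet in scored[:top_k] if score > 0]

# hypothetical scattered context for one request
sources = {
    "crm": ["acme corp contract: pending renewal, expires march"],
    "email": ["acme asked about renewal pricing last tuesday"],
    "billing": ["acme invoice paid in full"],
    "slack": ["unrelated thread about lunch"],
}

context = assemble_context("respond to acme renewal question", sources)
# the irrelevant slack snippet scores zero and is dropped; the email thread
# about the renewal ranks highest -- that ranked bundle is what the drafter sees.
```

the point isn't the scoring (which is deliberately naive) -- it's that the selection step runs per-request, before any generation happens, which is exactly the step most "automate the task" setups skip.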