Why would you give this sort of work to a machine that can't be responsibly used without checking its output anyway?
It's not obvious to me that LLMs can't be made reliable.