no. red-green tdd is great because you'll already have tests when your llm breaks something later, or when you're doing a massive refactor. i imagine the studies aren't done on codebases where the complexity gets that high.
tdd has been invaluable for this project (almost entirely llm-written, but i review it): https://github.com/ityonemo/clr
this is not really backed by any empirical evidence. there are simply more efficient means of verifying outputs than TDD.