> In a tight loop you'd want your cleanup to happen after the fact.
Why? Doing 10 000 iterations where each iteration allocates and operates on a resource, then going back afterwards and freeing those 10 000 resources, is no better than doing 10 000 iterations where each iteration allocates a resource, operates on it, and frees it. You just hold all 10 000 resources at once instead of one at a time.
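To make that concrete, here's a minimal Go sketch (Go's defer is function-scoped; `processAll` and `process` are hypothetical stand-ins for the actual work): every `defer f.Close()` inside the loop is queued up until the surrounding function returns, so every file stays open for the whole loop.

```go
package main

import (
	"log"
	"os"
)

// processAll opens every file in paths. With function-scoped defer, none of
// the deferred Close calls run until processAll returns, so 10 000 iterations
// means 10 000 open descriptors held at once.
func processAll(paths []string) error {
	for _, p := range paths {
		f, err := os.Open(p)
		if err != nil {
			return err
		}
		defer f.Close() // queued until processAll returns, not per iteration

		if err := process(f); err != nil {
			return err
		}
	}
	return nil
}

// process is a hypothetical stand-in for whatever work you do on one file.
func process(f *os.File) error {
	_ = f
	return nil
}

func main() {
	if err := processAll([]string{"a.dat", "b.dat"}); err != nil {
		log.Fatal(err)
	}
}
```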
> And in, say, an IO loop, you're going to want concurrency anyway
This is not necessarily true; not everything is so performance-sensitive that it's worth adding the significant complexity of going async. Often, a simple loop where each iteration opens a file, reads from it, and closes it is more than good enough.
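A sketch of that shape (Go again; `readOne` and the file names are hypothetical): each iteration calls a small helper, so the defer fires as soon as that file's work is done. No async machinery, and only one file is open at a time.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"os"
)

// readOne opens a single file, reads it, and closes it before returning.
// Because the defer lives in this helper, the file is closed at the end of
// each loop iteration in main, not at the end of the whole program.
func readOne(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close() // runs when readOne returns, i.e. once per file

	return io.ReadAll(f)
}

func main() {
	paths := []string{"one.txt", "two.txt", "three.txt"} // hypothetical inputs
	for _, p := range paths {
		data, err := readOne(p)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s: %d bytes\n", p, len(data))
	}
}
```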
Say you have a folder with a bunch of data files you need to work on. Maybe the work you do per file is significant and easily parallelizable; in that case you'd probably iterate through the files one by one and process each file with all your cores, rather than juggling many open files concurrently. There are even situations where the output of working on one file becomes part of the input for the work on the next file, which forces sequential iteration anyway.
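A hedged sketch of that structure (the chunking and `processFile` are hypothetical; the point is that files are visited strictly one after another while the cores are saturated inside each iteration, and one file's result feeds the next):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// processFile fans one file's records out across all cores, waits for them,
// and returns a summary that can feed into the next file's pass.
func processFile(records []string, carry int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total = carry
	)
	workers := runtime.NumCPU()
	chunk := (len(records) + workers - 1) / workers
	for start := 0; start < len(records); start += chunk {
		end := start + chunk
		if end > len(records) {
			end = len(records)
		}
		wg.Add(1)
		go func(part []string) {
			defer wg.Done()
			sum := 0
			for _, r := range part {
				sum += len(r) // stand-in for the real per-record work
			}
			mu.Lock()
			total += sum
			mu.Unlock()
		}(records[start:end])
	}
	wg.Wait()
	return total
}

func main() {
	files := [][]string{ // hypothetical pre-loaded file contents
		{"alpha", "beta"},
		{"gamma", "delta", "epsilon"},
	}
	carry := 0
	for _, records := range files {
		carry = processFile(records, carry) // one file's output feeds the next
	}
	fmt.Println("result:", carry)
}
```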
Anyway, I will concede that all of this is sort of an edge case which doesn't come up that often. But why should the obvious way be the wrong way? Block-scoped defer is the most obvious solution since variable lifetimes are naturally block-scoped; what's the argument for why it ought to be different?
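To illustrate the mismatch (Go, where defer is function-scoped): the variable below only lives for the inner block, yet its cleanup doesn't run until the whole function returns. Block-scoped defer would make the two line up.

```go
package main

import "fmt"

func main() {
	{
		f := "resource scoped to this block"
		defer fmt.Println("cleanup for:", f) // runs at function exit, not at '}'
		fmt.Println("using:", f)
	}
	// f is out of scope here, yet its deferred cleanup still hasn't run.
	fmt.Println("lots of other work happens here first")
	// Output order:
	//   using: resource scoped to this block
	//   lots of other work happens here first
	//   cleanup for: resource scoped to this block
}
```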