This matches what I'm seeing too. The refactoring and test generation use cases are where it actually saves real time. The tricky part is longer sessions where you lose track of context window and quota usage.