One of the best parts about GC languages is they tend to have much more efficient allocation/freeing because the cost is much more lumped together so it shows up better in a profile.
When it works. Many programs in GC language end up fighting the GC by allocating a large buffer and managing it by hand anyway because when performance counts you can't have allocation time in there at all. (you see this in C all the time as well)
Any extra throughput is far overshadowed by trying to control pauses and too much heap allocations happening because too much gets put on the heap. For anything interactive the options are usually fighting the gc or avoiding gc.
Agreed, however there is also a reason why the best ones also pack multiple GC algorithms, like in Java and .NET, because one approach doesn't fit all workloads.