Wouldn't this kind of architecture yield a slower compiler, regardless of output quality? Conceptually, implementing the fewest possible passes, with each doing as much work as possible, would make more sense to me.
There is nothing stopping you from building an old-fashioned single-pass compiler, if compile time is your only concern. The code it generates just wouldn't be very good.
Optimization level 2 in Chez Scheme does about 100 KLOC/s on my pretty modest machine, while also producing code that is pretty darn fast.