Are you using the commercial version of Saxon? It's not expensive, and IMHO worth it for the features it supports (including the newer standards) and the performance. If I remember correctly (it was a long time ago) it does some clever optimizations.
The final free version of Saxon is a lot faster than earlier ones too. My guess is that it compiles the XSLT in some way for the JVM to use.
We didn't use Saxon, I don't work there anymore. We also supported client-side (browser) XSLT processing, as well as server-side. It might have helped on the server side, maybe could even resolve some algorithmic complexities with some memoization (possibly trading off memory consumption).
But in the end the core problem is XSLT, the language. Despite being a complete programming language, your options are very limited for resolving performance issues when working within the language.