If most of the performance gains are hidden in a few middle layers, you can save a massive amount of compute by freezing the rest