Thank you. Strange. If the memory numbers are accurate, it is probably so slow because each layer's weights are loaded from disk right before that layer runs, or something like that; otherwise it could not run inference on a model of this size in 500MB. But if that is what it does, even 33% of the speed would likely already be too fast.
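
To put a rough number on that guess: if every weight really is re-read from disk once per forward pass, the time per token cannot be lower than model size divided by disk read bandwidth. Here is a minimal sketch of that arithmetic. The function name and both example numbers are my own placeholders, not measurements from this thread, and it ignores compute time and OS page caching entirely.

```python
def min_seconds_per_token(model_bytes: float, disk_bytes_per_sec: float) -> float:
    """Lower bound on time per token, ASSUMING all weights are streamed
    from disk once per forward pass (the hypothesis above, not a known fact)."""
    if disk_bytes_per_sec <= 0:
        raise ValueError("disk bandwidth must be positive")
    return model_bytes / disk_bytes_per_sec


# Hypothetical placeholder values, chosen only to show the calculation:
example_model_bytes = 4e9      # a 4 GB weight file (assumed)
example_disk_bandwidth = 2e9   # 2 GB/s sequential read (assumed)

print(min_seconds_per_token(example_model_bytes, example_disk_bandwidth))
```

If the measured speed comes out well above this bound for the real model size and disk, then either the weights are not actually being re-read every pass (for example, the page cache is serving them) or the memory number is not telling the whole story.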