It's already greatly improved over previous generations because M5s have tensor cores (higher compute throughput for matmul operations, which are the bottleneck for prefill).
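To make the "matmul is the bottleneck for prefill" point concrete, here's a rough back-of-the-envelope sketch comparing the arithmetic intensity (FLOPs per byte moved) of a single weight matmul during decode (one token) and prefill (many tokens). The layer size, prompt length, and fp16 element size are hypothetical values picked for illustration, not measurements from any M-series chip, and the helper name is made up.

```python
def arithmetic_intensity(n_tokens: int, d_in: int, d_out: int, bytes_per_elem: int = 2) -> float:
    """FLOPs per byte for multiplying an (n_tokens x d_in) activation
    matrix by a (d_in x d_out) weight matrix.

    Simplifying assumption: every operand is read or written exactly once.
    """
    # One multiply and one add per (token, input, output) triple.
    flops = 2 * n_tokens * d_in * d_out
    # Weights + input activations + output activations.
    elems_moved = d_in * d_out + n_tokens * d_in + n_tokens * d_out
    return flops / (bytes_per_elem * elems_moved)


# Hypothetical layer and prompt sizes, chosen only for illustration.
D_MODEL = 4096
PROMPT_TOKENS = 2048

decode_intensity = arithmetic_intensity(1, D_MODEL, D_MODEL)
prefill_intensity = arithmetic_intensity(PROMPT_TOKENS, D_MODEL, D_MODEL)

print(f"decode:  {decode_intensity:.2f} FLOPs/byte")
print(f"prefill: {prefill_intensity:.2f} FLOPs/byte")
```

Under these assumptions decode does about one FLOP per byte, so it's limited by memory bandwidth, while prefill reuses each weight across the whole prompt and does on the order of a thousand FLOPs per byte, so it's limited by how fast the hardware can do matmuls. That's why dedicated matmul units help prefill much more than they help token generation.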