Apple is basically in the same boat as AMD and Intel. They have a weak, raster-focused GPU architecture that doesn't scale to 100B+-parameter inference workloads and especially struggles with large-context prefill. TPUs smoke them on inference, and Nvidia hardware is far and away more efficient for training.
This doesn't get talked about enough - the GPU is weak, weak, weak. And anyone who could fix it will get poached by a serious AI company (for 2-3x the salary).
Apple is in a much better boat than AMD or Intel. They have a gigantic war chest and can just snap up whoever looks like a leader coming out of the bubble burst.
What do TPUs do to improve on GPUs at inference?