I'm confused how anyone ever thought the NPU would be a good idea. The GPU is almost always underutilized on Mac and could do the brunt of the work for inference if it embraced GPGPU principles from the start. Creating a dedicated hardware block to alleviate a theoretical congestion issue is... bewildering. That goes for most NPUs I've seen.
Apple had the technology to scale down a GPGPU-focused architecture just like Nvidia did. They had the money to take that risk, and had the chip design chops to take a serious stab at it. On paper, they could have even extended it to iPhone-level edge silicon similar to what Nvidia did with the Jetson and Tegra SOCs.
I'm confused how anyone ever thought the NPU would be a good idea. The GPU is almost always underutilized on Mac and could do the brunt of the work for inference if it embraced GPGPU principles from the start. Creating a dedicated hardware block to alleviate a theoretical congestion issue is... bewildering. That goes for most NPUs I've seen.
Apple had the technology to scale down a GPGPU-focused architecture just like Nvidia did. They had the money to take that risk, and had the chip design chops to take a serious stab at it. On paper, they could have even extended it to iPhone-level edge silicon similar to what Nvidia did with the Jetson and Tegra SOCs.