Hacker News

zozbot234 · yesterday at 7:10 PM

People keep saying this, but I'm not seeing a big difference from other NPU varieties. Either way, we're still talking about very experimental stuff that also tends to be hardwired toward a predetermined use case. So I'm not surprised that people are running into problems while trying to make these more broadly useful.


Replies

wtallis · yesterday at 7:38 PM

True; everybody's NPUs are afflicted by awkward hardware and software constraints that don't come close to keeping pace with the rapidly shifting interests of ML researchers.

To some degree, that's an unavoidable consequence of how long it takes to design and ship specialized hardware with a supporting software stack. By contrast, ML research moves way faster because researchers hardly ever ship anything product-like; it's a good day when the installation instructions for some ML thing include only three steps that amount to "download more Python packages".

And the lack of cross-vendor standardization for APIs and model formats is also at least partly a consequence of various NPUs evolving from very different starting points and original use cases. For example, Intel's NPUs are derived from Movidius, so they were originally designed for computer vision, and it's not at all a surprise that making them do LLMs might be an uphill battle. AMD's NPU comes from Xilinx IP, so their software mess is entirely expected. Apple and Qualcomm NPUs presumably are still designed primarily to serve smartphone use cases, which didn't include LLMs until after today's chips were designed.
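To make that fragmentation concrete, here's a minimal Python sketch of what targeting each vendor's NPU through a common layer can look like today, assuming ONNX Runtime as that layer. The execution provider names are real ONNX Runtime identifiers, but the vendor-to-provider mapping and the CPU fallback are my assumptions for illustration, and whether a given model actually runs on a given NPU is not guaranteed.

    # Sketch: routing one ONNX model to different vendor NPU stacks via ONNX Runtime.
    # Each vendor's NPU sits behind a different execution provider, each with its own
    # packaging, quantization, and operator-coverage quirks.
    import onnxruntime as ort

    NPU_PROVIDERS = {
        "intel":    "OpenVINOExecutionProvider",   # Movidius-derived NPUs via OpenVINO
        "amd":      "VitisAIExecutionProvider",    # Xilinx-derived NPUs via Vitis AI
        "qualcomm": "QNNExecutionProvider",        # Hexagon NPUs via Qualcomm QNN
        "apple":    "CoreMLExecutionProvider",     # Apple Neural Engine via Core ML
    }

    def make_session(model_path: str, vendor: str) -> ort.InferenceSession:
        """Create an inference session for the given vendor's NPU, falling back to
        CPU if the provider isn't installed or can't handle the model's operators."""
        provider = NPU_PROVIDERS[vendor]
        providers = [provider] if provider in ort.get_available_providers() else []
        providers.append("CPUExecutionProvider")
        return ort.InferenceSession(model_path, providers=providers)

And even when the provider loads, each stack typically wants its own offline model preparation (format conversion, quantization, operator workarounds), which is where the "uphill battle" for LLM-shaped workloads tends to show up.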

It'll be very interesting to see how this space matures over the next several years, and whether the niche of specialized low-power NPUs survives in PCs or if NVIDIA's approach of only using the GPU wins out. A lot of that depends on whether anybody comes up with a true killer app for local on-device AI.
