
ismailmaj • today at 12:01 PM • 3 replies

Their moat is CUDA, CUDA libraries, and everything built on top.

When a new architecture drops, it's always PyTorch running on CUDA; other PyTorch backends are best effort. Even if they reach feature parity, many industry power users have gone closer to the metal to squeeze out performance, and that work is too specific to Nvidia hardware.

If there is something that will beat Nvidia, it won't be something that reaches feature parity with slightly better economics (like AMD; Nvidia could also just reduce its margins). It needs to be a novel approach worth rewriting the codebase for (maybe Cerebras, maybe a new player).

Replies

HarHarVeryFunny • today at 5:11 PM

> Their moat is CUDA, CUDA libraries, and everything built on top

Sure, but to state the obvious, that is only a factor for people using CUDA!

There are also whole segments of the AI market, like Google using TPUs and Amazon using Trainium chips, where CUDA is irrelevant.

If the AI boom is really going to happen, then inference volume needs to ramp up and dominate training costs, and the winners are going to be whoever can do inference the cheapest, which probably isn't going to be anyone paying the NVIDIA tax!

The benefit of CUDA is more for development, and for the hyperscalers serving models that use CUDA APIs (bespoke business models). Anthropic currently supports both CUDA and Trainium, and X.ai (who seem to be fizzling out) are on CUDA, although there was some talk of Musk getting Samsung to make "AI chips" of some sort.

As far as AMD goes, I'm sure the developers at AMD's biggest sites - the exascale national labs - have a whole other level of support than consumers, and no doubt a toolset that works great for those fixed environments.

0xDEAFBEAD • today at 12:42 PM

I don't understand why AMD can't offer a drop-in replacement for CUDA that implements an identical API.

How much actual diversity is there among standard AI workloads? I would expect this is an 80/20 thing where 80% of the workload uses 20% of the features.

> Nvidia could just reduce their margins

Commoditization is great for stock prices ;-)


twobitshifter • today at 3:00 PM

At some point there will be models that are ‘good enough’ and run on Chinese chips, mobile processors, and run-of-the-mill chips from Apple. Whether it comes from one-bit or ternary models, innovations that limit the size of the context, or something else, it is coming. The balance has already shifted toward making these systems less resource intensive, which is a clear need given the enormous data center costs.
