Hacker News

d3Xt3r, today at 9:17 AM

> Also, note that there's zero CUDA dependency.

So does this mean I can run this on AMD? And on a consumer 9000 series card?


Replies

HarHarVeryFunny, today at 11:15 AM

If you don't have the source code then it makes no difference. If you have the weights and are running some model via llama.cpp, then you are using whatever API llama.cpp is using, not the API that was used to train the model or that anyone else may be using to serve it.

randomgermanguy, today at 9:41 AM

If you found a rare 9000 card with 200+ GB of VRAM, sure

Eisenstein, today at 1:49 PM

Yes, if the card supports Vulkan and the model has GGUF weights. llama.cpp has excellent Vulkan support that is being actively developed and is not far behind CUDA in speed.

* https://github.com/ggml-org/llama.cpp/releases
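For context, the Vulkan path described above is a build-time option in llama.cpp. A minimal sketch of building with the Vulkan backend and running a GGUF model might look like this (the model filename is a placeholder; `-ngl 99` requests offloading all layers to the GPU, assuming it has enough VRAM):

```shell
# Build llama.cpp with the Vulkan backend instead of CUDA.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Run inference on a GGUF model (placeholder path), offloading layers to the GPU.
./build/bin/llama-cli -m ./model.gguf -p "Hello" -ngl 99
```

This requires the Vulkan SDK (for `glslc`/headers) to be installed; on AMD, the Mesa RADV driver is typically sufficient at runtime.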