> Also, note that there's zero CUDA dependency.
So does this mean I can run this on AMD? And on a consumer 9000 series card?
If you found a rare 9000 card with 200+ GB of VRAM, sure
If the card supports Vulkan and the model has GGUF weights, yes. llama.cpp has excellent Vulkan support that is being actively developed and is not far behind CUDA where speed is concerned.
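For what that looks like in practice, here is a minimal sketch of building llama.cpp with its Vulkan backend and running a GGUF model on an AMD card. The model path is a placeholder, and the build flag reflects recent llama.cpp versions (older ones used a different flag name), so treat the exact invocation as an assumption:

```shell
# Sketch, not a definitive guide: build llama.cpp with the Vulkan backend.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON   # recent versions; older builds used a different flag
cmake --build build --config Release

# Run a GGUF model; /path/to/model.gguf is a placeholder.
# -ngl offloads that many layers to the GPU (99 = effectively all of them).
./build/bin/llama-cli -m /path/to/model.gguf -ngl 99 -p "Hello"
```

No CUDA toolkit is involved anywhere in that pipeline; the only requirement on the GPU side is a working Vulkan driver.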
If you don't have the source code, it makes no difference. If you have the weights and are running the model via llama.cpp, you are using whatever API llama.cpp exposes, not the API that was used to train the model or that anyone else may be using to serve it.