hypfer • yesterday at 3:46 PM • 4 replies

The same 24GB VRAM RTX 4090 I bought to play Cyberpunk 2077 with.

Works perfectly fine in llama.cpp, throwing 70+ t/s at me with 128k of q8 K/V context when using the IQ4_NL quant + MTP, with the MTP K/V at q4.

Also leaving this here because you might find it useful: https://hypfer.github.io/will-it-fit-llama-cpp/
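
For anyone who wants the back-of-the-envelope version of a "will it fit" check, here is a minimal sketch. It is not the linked tool's method, just the usual K/V cache arithmetic for a plain transformer with full attention in every layer. The model dimensions, the weight size, and the 1 GiB overhead in the usage lines are hypothetical placeholders, not the numbers for the model discussed above; the q8_0 and q4_0 sizes follow from ggml's block layout (32 values plus one fp16 scale per block).

```python
# Bytes per cached element for common llama.cpp cache types.
# q8_0 packs 32 values into 34 bytes, q4_0 packs 32 values into 18 bytes.
BYTES_PER_ELEMENT = {
    "f16": 2.0,
    "q8_0": 34 / 32,
    "q4_0": 18 / 32,
}

GIB = 1024 ** 3


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, cache_type="q8_0"):
    """Estimate the combined K and V cache size in bytes.

    Assumes every layer uses full attention with the same number of KV heads;
    hybrid or sliding-window models will need less than this.
    """
    elements_per_token = 2 * n_layers * n_kv_heads * head_dim  # K plus V
    return int(n_ctx * elements_per_token * BYTES_PER_ELEMENT[cache_type])


def fits(weights_gib, kv_bytes, vram_gib=24.0, overhead_gib=1.0):
    """Return True if weights + K/V cache + a fixed overhead fit in VRAM.

    overhead_gib is a guessed allowance for compute buffers and the desktop.
    """
    return weights_gib + kv_bytes / GIB + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical model: 32 layers, 8 KV heads, head_dim 128, 13 GiB of weights.
    for cache_type in ("f16", "q8_0", "q4_0"):
        kv = kv_cache_bytes(131072, 32, 8, 128, cache_type)
        print(cache_type, round(kv / GIB, 2), "GiB, fits:", fits(13.0, kv))
```

With those placeholder dimensions, a 128k context costs 16 GiB at f16 but 8.5 GiB at q8_0, which is the difference between not fitting and fitting next to 13 GiB of weights on a 24 GB card.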

Replies

Rzor • yesterday at 10:28 PM

Can you fix MTP-GEMMA-4-26B-A4B-IT? It says the weights are 0.5 GB in size.

edit: nvm, I'm confusing models.

indoordin0saur • yesterday at 3:54 PM

Nice! Do you do anything with that compute when you're not actively using it? Is the crypto-mining hobby still worth it? I've also wondered if such expensive hardware can be rented back out to offset cost. Looks like these cards are going for as much as $4k nowadays.

cdelsolar • yesterday at 3:52 PM

What did you call me?
