stymaar last Monday at 7:00 PM

> That 3090 is going to burn 750W

The 3090's TDP is 350 W, but since LLM token generation isn't compute-bound, people usually undervolt these cards to reduce power consumption. IIRC you can get as low as 200-250 W without any performance degradation. Caveat: these figures are without speculative decoding and at batch size = 1.
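To put those numbers side by side, here is a quick back-of-the-envelope sketch. It only uses the figures quoted above (350 W stock, 200-250 W reduced); the function name `savings_fraction` is made up for illustration.

```python
TDP_WATTS = 350  # stock 3090 figure quoted in the comment above

def savings_fraction(reduced_watts, stock_watts=TDP_WATTS):
    """Fraction of stock board power saved when running at reduced_watts."""
    return (stock_watts - reduced_watts) / stock_watts

# The 200-250 W range mentioned above, compared against stock.
for watts in (200, 250):
    print(f"{watts} W: {savings_fraction(watts):.0%} below stock")
```

This compares power ceilings only; actual draw during generation depends on the workload.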


Replies

4chandaily last Monday at 7:15 PM

This is correct. I have four 3090s in my inference server, each capped at 250 W. I run Qwen 3.5 122B-A10 at about 45-50 tok/s on this setup and am quite happy with it. At idle, all four together draw around 95-105 W, which is a bit high but tolerable.
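For anyone wanting to try something similar, a rough sketch follows. It assumes the cap is applied with `nvidia-smi -pl` (a power limit, which is not the same thing as undervolting, and which normally needs root); the comment above doesn't say which tool was actually used. The helper names are hypothetical. The energy figure is only a GPU-side upper bound, since it assumes every card sits at its cap and ignores the rest of the system.

```python
import subprocess

def power_limit_command(gpu_index, watts):
    """Build the nvidia-smi invocation that sets a power cap on one GPU."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]

def apply_power_limit(gpu_index, watts):
    """Run the command; requires an NVIDIA driver and usually root."""
    subprocess.run(power_limit_command(gpu_index, watts), check=True)

def joules_per_token_upper_bound(num_gpus, cap_watts, tokens_per_second):
    """GPU-only upper bound: assumes all cards draw their full cap."""
    return num_gpus * cap_watts / tokens_per_second

if __name__ == "__main__":
    # Figures from the comment: four cards, 250 W cap, 45-50 tok/s.
    for rate in (45, 50):
        bound = joules_per_token_upper_bound(4, 250, rate)
        print(f"{rate} tok/s: at most {bound:.1f} J/token (GPUs only)")
```

Calling `apply_power_limit(i, 250)` for `i` in `range(4)` would cap each card; the setting typically does not survive a reboot, so it is usually reapplied at startup.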