logoalt Hacker News

gyrovagueGeistyesterday at 2:25 PM1 replyview on HN

Electricity (on continental US) is pretty cheap assuming you already have the hardware:

Running at a full load of 1000W for every second of the year, for a model that produces 100 tps at 16 cents per kWh, is $1200 USD.

The same amount of tokens would cost at least $3,150 USD on current Claude Haiku 3.5 pricing.


Replies

ac29yesterday at 2:42 PM

This 35B-A3B model is 4-5x cheaper than Haiku though, suggesting it would still be cheaper to outsource inference to the cloud vs running locally in your example