gyrovagueGeist • yesterday at 2:25 PM • 1 reply

Electricity (in the continental US) is pretty cheap, assuming you already have the hardware:

Running at a full load of 1000 W for every second of the year (8,760 kWh), for a model that produces 100 tps, at 16 cents per kWh comes to about $1,400 USD.

The same number of tokens (about 3.15 billion) would cost at least $3,150 USD on current Claude Haiku 3.5 pricing.
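
For anyone who wants to check the arithmetic, here is a rough Python sketch using the inputs stated above (1000 W, 16 cents per kWh, 100 tps, a 365-day year). The $1.00 per million tokens rate is an assumption, back-derived from the $3,150 figure rather than taken from a price sheet, and the function names are made up for illustration.

```python
# Rough sketch of the local-vs-API comparison. The API rate below is an
# assumption back-derived from the $3,150 figure, not a quoted price.
HOURS_PER_YEAR = 24 * 365          # 8,760 hours in a non-leap year
SECONDS_PER_YEAR = HOURS_PER_YEAR * 3600


def electricity_cost_usd(load_watts, usd_per_kwh, hours=HOURS_PER_YEAR):
    """Cost of running a constant load for the given number of hours."""
    return load_watts / 1000 * hours * usd_per_kwh


def tokens_generated(tokens_per_second, seconds=SECONDS_PER_YEAR):
    """Total tokens produced at a constant generation rate."""
    return tokens_per_second * seconds


def api_cost_usd(tokens, usd_per_million_tokens):
    """Cost of buying the same number of tokens from a hosted API."""
    return tokens / 1_000_000 * usd_per_million_tokens


power_cost = electricity_cost_usd(1000, 0.16)
tokens = tokens_generated(100)
cloud_cost = api_cost_usd(tokens, 1.00)  # assumed $1.00 per million tokens

print(f"electricity: ${power_cost:,.2f}")   # electricity: $1,401.60
print(f"tokens:      {tokens:,}")           # tokens:      3,153,600,000
print(f"API cost:    ${cloud_cost:,.2f}")   # API cost:    $3,153.60
```

With these inputs the electricity bill works out to roughly $1,400 for about 3.15 billion tokens. Hardware cost and idle time are left out, as in the comment.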

Replies

ac29 • yesterday at 2:42 PM

This 35B-A3B model is 4-5x cheaper than Haiku, though, which suggests it would still be cheaper to outsource inference to the cloud than to run it locally in your example.
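
A quick sketch of that comparison in Python, taking the parent's $3,150 Haiku figure and the 4-5x discount at face value (both are the thread's numbers, not verified prices). The electricity cost is recomputed from the parent's stated 1000 W and 16 cents per kWh; the helper name is hypothetical.

```python
# Sketch: is hosted inference for the cheaper model below the local power bill?
# Inputs are the thread's own figures, taken at face value.
HAIKU_COST_USD = 3150.0            # parent comment's API estimate for one year
LOCAL_POWER_USD = 1.0 * 8760 * 0.16  # 1 kW for 8,760 h at $0.16 per kWh


def hosted_cost_usd(haiku_cost, times_cheaper):
    """Hosted cost if the model is `times_cheaper` times cheaper than Haiku."""
    return haiku_cost / times_cheaper


for factor in (4, 5):
    hosted = hosted_cost_usd(HAIKU_COST_USD, factor)
    verdict = "cheaper" if hosted < LOCAL_POWER_USD else "more expensive"
    print(f"{factor}x cheaper: ${hosted:,.2f} hosted, {verdict} than local power")
```

At 4x the hosted cost is $787.50 and at 5x it is $630.00, both under the yearly electricity cost, which is the point being made here.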
