> GPU compute (not rasterization) it’s between an M4 Pro and M4 Max without considering bandwidth...

storus • today at 4:56 PM • 1 reply

> GPU compute (not rasterization) it’s between an M4 Pro and M4 Max without considering bandwidth

You are likely thinking of token generation, which depends on memory bandwidth, where Apple has an edge. Spark's GPU compute is way higher than even the M5 Max's (17 FP32 TFLOPS), around 2x the FP32 TFLOPS... It's literally 6144 CUDA cores, like the desktop 5070, held back by slower memory and a lower TDP (29.7 vs 31 FP32 TFLOPS on the 5070).
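
For anyone who wants to check the arithmetic behind those peak figures, here is a rough sketch. It assumes the usual convention that each CUDA core retires one fused multiply-add (counted as 2 FLOPs) per clock; the function names are made up for illustration, and the clocks are only what the quoted TFLOPS numbers imply, not measured or official values.

```python
# Sketch: relate CUDA core count, clock, and peak FP32 TFLOPS.
# Assumption: 1 FMA = 2 FLOPs per core per cycle (common peak-rate convention).

def peak_fp32_tflops(cuda_cores: int, clock_ghz: float,
                     flops_per_core_per_cycle: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPS."""
    # cores * FLOPs/cycle * Gcycles/s gives GFLOPS; divide by 1000 for TFLOPS.
    return cuda_cores * flops_per_core_per_cycle * clock_ghz / 1000.0


def implied_clock_ghz(tflops: float, cuda_cores: int,
                      flops_per_core_per_cycle: int = 2) -> float:
    """Clock (GHz) that a quoted peak TFLOPS figure implies for a core count."""
    return tflops * 1000.0 / (cuda_cores * flops_per_core_per_cycle)


CORES = 6144  # core count quoted in the comment above

# Clocks implied by the two quoted peaks (29.7 and 31 FP32 TFLOPS).
lower_tdp_clock = implied_clock_ghz(29.7, CORES)
desktop_clock = implied_clock_ghz(31.0, CORES)

print(f"implied clock at 29.7 TFLOPS: {lower_tdp_clock:.2f} GHz")
print(f"implied clock at 31.0 TFLOPS: {desktop_clock:.2f} GHz")
```

With the same core count, the gap between the two peaks comes down to the clock each part would need to sustain, which is where TDP comes in.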

Replies

dagmx • today at 5:56 PM

That’s only if you consider FP32 specifically. On average, the M5 Max will pull ahead in tasks like GPU ray tracing (it’s currently the fastest mobile GPU for Blender rendering), token generation, and anything else that benefits from its higher memory bandwidth.
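
To make the bandwidth point concrete, here is a back-of-the-envelope sketch of the usual upper bound for single-stream token generation: every generated token has to stream roughly all the model weights from memory once, so tokens per second cannot exceed bandwidth divided by model size. The bandwidth and model-size numbers below are placeholders picked for illustration, not specs of any chip mentioned here, and the function name is hypothetical.

```python
# Sketch: bandwidth-bound ceiling on token generation (batch size 1).
# Assumption: each token requires reading all weights once; compute, caches,
# and KV-cache traffic are ignored, so this is only an upper bound.

def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed when memory bandwidth is the bottleneck."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


MODEL_SIZE_GB = 20.0      # hypothetical quantized model size
FAST_MEMORY_GB_S = 500.0  # hypothetical higher-bandwidth device
SLOW_MEMORY_GB_S = 250.0  # hypothetical lower-bandwidth device

fast = max_tokens_per_second(FAST_MEMORY_GB_S, MODEL_SIZE_GB)
slow = max_tokens_per_second(SLOW_MEMORY_GB_S, MODEL_SIZE_GB)

print(f"higher-bandwidth ceiling: {fast:.1f} tok/s")
print(f"lower-bandwidth ceiling:  {slow:.1f} tok/s")
```

Under this model, doubling memory bandwidth doubles the ceiling regardless of how many FP32 TFLOPS the GPU has.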

I’d also mention that you’re comparing peaks, which the RTX Spark won’t be hitting. Its top TDP is lower than that of the DGX Spark.

I just think anyone calling this a beast and a game changer is conflating/extrapolating from different form factors and constraints.
