Something that is also cool with these cards is proper SR-IOV without hassle. Arc pro cards make for nice graphical acceleration devices for vms. I know ai gets all the hype but I also appreciate being able to accelerate multiple workstations with a single gpu and still get decent frametimes.
Hi Intel, I'm itching to buy an Xe3P! Or, Nova Lake? Crescent Island? Celestial? Jaguar Shores?
Whatever the hell you name it doesn't matter to me, I just want a workstation with one of them bad boys attached to 160GB of RAM for legit inference power!
I've been saving my money not paying for Claude Code so I can run my own agentic coding setup at home on yours. Please don't charge too much for the workstation class card if you can at all manage it. Maybe give us a discount to preorder? Please don't price a regular consumer like me out of the market!
Also, I'm speculating that integer-based models will become hot due to lower memory and power requirements. Will the Xe3P be able to do integer-based math inference to use all that RAM to even greater effect?
Time to first token is a very important performance metric, as I figured out using a Mac Studio M3 Ultra (that is quite slow on this aspect).
But 32GB for a TDP of 230W is perhaps not super interesting. Especially because you probably want to have more than one card. It's a lot of heat. You could use the cards for heating up a building, but heatpumps exist.
At release, the Intel Arc B70 can only produce about 1/3 the tokens per second of an RTX PRO 4500. Then again, it also costs 1/3 as much.
It lacks software support for its primary target application: running LLMs. The officially supported vLLM fork is six versions behind mainline, it doesn't run the hot new open models on Hugging Face, and running two B70s in parallel reduces the token rate rather than improving it. The software behind the B70 is simply far behind.
Here are some llama.cpp benchmarks for it: https://www.phoronix.com/review/intel-arc-pro-b70-linux/3
I was looking into this for LLMs but it's clearly a graphics-processing focused card. The memory bandwidth is too low for that much RAM to be useful in an LLM context. The 5090 I have has the same amount of RAM but far more bandwidth and that makes it much more useful.
For those who use Blender, from the review's section on it:
> We hope that, in the future, there will be real options other than NVIDIA for GPU-based rendering, as it is an area where competition is nearly non-existent.
Checking opendata.blender.org, an NVIDIA GeForce RTX 4080 Laptop GPU scores 5301.8, while the Intel Arc Pro B70 is still at 3824.64.
So there is still a bit more to go before Intel GPUs perform close to NVIDIA's.
Is Intel still making GPUs? I have heard so many conflicting things about will they/won't they stay in the market.
I would like one for the VRAM, but I'm sure they'll be unobtainable after the initial stock sells out, as I assume they were produced before RAM prices went up.
It should be possible to use the VRAM as extra swap space, when you're not using it for AI or gaming or anything else. 32GB is already more than a lot of computers have as just regular RAM, even sufficient to hold an OS installation:
https://www.tomshardware.com/news/lightweight-windows-11-run...
It's strange that the reviewer doesn't mention the RTX PRO 6000 96GB, but does mention the RTX PRO 5000 72GB. The 72GB RTX PRO 5000 is a special-order part that far fewer people are aware of, while the RTX PRO 6000 is known to almost everyone in the LLM world.
I can't understand why a tech reviewer would do that.
How should I update my simplistic understanding that decode is bandwidth-bound, given these results showing the B70 decoding faster than a 4090 (which has about 50% more bandwidth)?
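The bandwidth-bound mental model can be sketched in a few lines: each decoded token has to stream the active weights from VRAM once, so bandwidth divided by model size gives an upper bound on tokens/s. The bandwidth and model-size figures below are rough illustrative assumptions, not measured specs:

```python
# Naive roofline for LLM decode: tokens/s <= memory bandwidth / weight bytes,
# because every generated token reads all (active) weights once.

def decode_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on decode throughput for a dense model."""
    return bandwidth_gb_s / model_gb

# Illustrative numbers (assumptions, not official specs):
model_gb = 18.0       # e.g. a ~30B-param model quantized to ~4-5 bits/weight
b70_bw = 456.0        # GB/s, rough Arc Pro B70 figure
rtx4090_bw = 1008.0   # GB/s, rough RTX 4090 figure

print(f"B70 bound:  {decode_tokens_per_sec(b70_bw, model_gb):.1f} tok/s")
print(f"4090 bound: {decode_tokens_per_sec(rtx4090_bw, model_gb):.1f} tok/s")
```

By this model the higher-bandwidth card should always win on decode, so when a benchmark shows the opposite, the explanation is usually outside the roofline: partial CPU offload, an immature software stack, or compute-bound prefill being measured instead of decode.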
Can you use those AI cards for gaming too?
Or do the makers intentionally nerf them to better segment the markets/product lines?
$950 for 23 TFLOPS fp32? Has GPU performance grown at all in the past 5-10 years?
Could we not have a PCIe card that's an ASIC (not a GPU) with even DDR4 or DDR5 memory onboard (say, 128 GB), shove four of them into a consumer-grade motherboard, and use them in parallel?
Noob question.
These seem amazing for hobbyists, but that TDP given the perf might be an issue when deploying a lot of them.
It looks like, if one can afford it, the R9700 is worth the extra money.
I read that Intel is getting out of the dGPU space, but then again, their iGPUs are really getting good. I can't understand why they'd give up the space when the AI market is so insane.
Why are they still using their old Xe2/Battlemage architecture rather than their new Xe3/Celestial? They already used it in their Panther Lake chips.
From what I've read the Intel drivers are terrible and holding back using them for LLMs.
this review was essentially pointless, they reviewed the card for a ton of workloads nobody in their right mind would pick it for, and left out the only use case where it makes sense. great job?
There's a tradeoff between dense models and MoEs on memory usage vs. compute for the same quality.
For example, Qwen3.5 27B and Qwen3.5 122B A10B have similar average performance across benchmarks. The 122B is much faster to run than the 27B (generates more tokens at the same compute). The 27B, on the other hand, uses ~4x less VRAM at low context lengths (less difference at high context lengths).
Right now, different hardware seems to be suited to different points in the dense vs. MoE balance. On one extreme is hardware like the DGX Spark and Strix Halo which have a lot of memory compared to compute performance and memory bandwidth, and are best-suited for MoE workflows. On the other extreme you have cards like RTX 5090 which have very high performance for the price but rather little memory, and is best suited for dense models.
The Arc Pro B70 seems to be the awkward middle. With 1-2 of these, you can run a ~30B dense model slowly, probably not fast enough to be useful interactively (you'd probably need a 5090 or 2x 3090 for that). Or, you can run a MoE model at high throughput, but probably not enough quality to support agentic workflows that actually use your throughput.