Benchmarks are here: https://kyuz0.github.io/amd-strix-halo-vllm-toolboxes/
Would love to see DeepSeek V4 flash/pro and MiniMax M3 benchmarks, but these are already pretty impressive. It's the first Strix Halo setup I've seen with some serious performance.
EDIT: Apologies, I think I misunderstood these benchmarks. It seems this is actually very slow compared to an M4 or M5 chip with a good amount of memory. Looking at the creator's video here: https://youtu.be/Cfl3TS7ME5s?t=734, the performance of Strix Halo seems to be much slower than what I get on my M4 MBP, which gets ~400 tok/s prefill and ~20 tok/s generation.
The pp (prompt processing) speeds are really slow (50). I think there's still room for improvement.
They are heavily bogged down by memory bandwidth, unfortunately. The Macs are on another level. If Apple decides to release AI-dedicated hardware, it would dominate this space (consumer AI).
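
For anyone wondering why bandwidth matters so much for generation speed, here is a rough back-of-the-envelope sketch. It assumes decoding is purely memory-bound, i.e. every generated token requires reading all active weights from memory once, so the result is only an upper bound and says nothing about prefill, which tends to be limited by compute instead. The function name and all numbers below are made up for illustration; they are not measurements from the linked benchmarks.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Rough upper bound on generation speed for a memory-bound decoder.

    Assumption: each generated token reads all active weights once,
    so tok/s <= memory bandwidth / bytes read per token.
    """
    if bandwidth_gb_s <= 0 or active_weights_gb <= 0:
        raise ValueError("bandwidth and weight size must be positive")
    return bandwidth_gb_s / active_weights_gb


if __name__ == "__main__":
    # Hypothetical machines and model size, chosen only to show the scaling.
    active_weights_gb = 16.0  # e.g. the weights touched per token after quantization
    for label, bandwidth in [("machine A", 256.0), ("machine B", 512.0)]:
        bound = decode_tokens_per_second(bandwidth, active_weights_gb)
        print(f"{label}: {bandwidth:.0f} GB/s -> at most {bound:.1f} tok/s")
```

Under this assumption, doubling the memory bandwidth doubles the ceiling on generation speed for the same model, which is why two machines with similar compute can still end up far apart in tok/s.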