Hacker News

sho | today at 6:58 AM

So, this is the version that's able to serve inference from Huawei chips, although it was still trained on Nvidia. So unless I'm very much mistaken, this is the biggest and best model yet served on (sort of) readily-available Chinese-native tech. Performance and stability will be interesting to see; OpenRouter is currently showing about 1.12 s latency and 30 tokens/s, which isn't wonderful, but it's day one after all.
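Those OpenRouter numbers translate into a rough end-to-end response time. A minimal back-of-the-envelope sketch (the function name is mine, and I'm assuming the 1.12 s figure is time-to-first-token):

```python
def response_time_s(n_tokens: int, ttft_s: float = 1.12, tps: float = 30.0) -> float:
    """Time to first token plus generation time for n_tokens output tokens."""
    return ttft_s + n_tokens / tps

# A ~500-token answer at 30 tokens/s with 1.12 s time-to-first-token:
print(f"{response_time_s(500):.1f} s")  # roughly 17.8 s
```

So a medium-length answer takes on the order of 15-20 seconds at these rates, which is serviceable but well behind the faster hosted frontier models.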

For reference, the Huawei Ascend 950 that this thing runs on is supposed to be roughly comparable to Nvidia's H100 from 2022. In other words, things are heating up in the GPU war!


Replies

alpineman | today at 7:48 AM

Can't see how Nvidia justifies its valuation/forward P/E ratio with these developments, and with on-device inference also becoming viable for 98% of people's AI needs.

npodbielski | today at 7:23 AM

Great! Can't wait to buy a decent GPU for inference for under $1k.