>for applications like AI, even using system RAM is often considered too slow, simply because of the distance to the GPU
That's not why. It's because system RAM has a much narrower bus than VRAM. If it were purely a matter of distance, you'd just get higher latency, and you'd still have tons of bandwidth to play with.
You could be charitable and say the bus is narrow because it has to travel a long distance, which makes it hard to route a lot of traces.
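
To put rough numbers on the bus-width point, here's a back-of-envelope sketch. Theoretical peak bandwidth is just bus width times transfer rate. The two configs are examples I picked for illustration (dual-channel DDR5-6400 on a 128-bit bus, and a 384-bit GDDR6X card at 21 GT/s per pin), and `peak_bandwidth_gb_s` is a made-up helper name. Real-world throughput will be lower than these peaks.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfers_per_sec: float) -> float:
    """Theoretical peak bandwidth in GB/s: bytes per transfer * transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_sec / 1e9


# Illustrative configs (assumptions, not taken from the thread):
# dual-channel DDR5-6400 -> 2 x 64-bit channels = 128-bit bus, 6400 MT/s
system_ram = peak_bandwidth_gb_s(128, 6400e6)
# 384-bit GDDR6X at 21 GT/s per pin
vram = peak_bandwidth_gb_s(384, 21e9)

print(f"system RAM: {system_ram:.1f} GB/s")
print(f"VRAM:       {vram:.1f} GB/s")
print(f"ratio:      {vram / system_ram:.1f}x")
```

Note that latency doesn't show up anywhere in that formula. With enough requests in flight, a longer trip only delays the first byte; it doesn't cap how many bytes per second arrive after that.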