Remote GPU compute payloads have been around a lot longer than LLMs; they're just few and far between.
Folding@home and other such asynchronous "get this packet of work done and get back to me" style operations rarely care much about latency.
Remote transcoding efforts can usually adjust whatever buffering is needed to cover huge latency gaps, and a lot of sim and render suites can do remote work regardless of machine-to-machine latency.
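A minimal sketch of that "packet of work" pattern, using local threads and queues to stand in for the network (all names here are made up for illustration; real systems like Folding@home add checkpointing, retries, and verification):

```python
import queue
import threading
import time

def worker(outbox, inbox, simulated_latency=0.01):
    """Stand-in for a remote machine: pull a work unit, compute, send back."""
    while True:
        wu = outbox.get()
        if wu is None:              # sentinel: no more work
            inbox.put(None)
            break
        time.sleep(simulated_latency)   # round-trip delay nobody cares about
        inbox.put((wu, wu * wu))        # hypothetical "computed" result

def run():
    outbox, inbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(outbox, inbox))
    t.start()
    for wu in range(5):             # dispatcher hands out packets of work
        outbox.put(wu)
    outbox.put(None)
    results = {}
    while True:                     # collect results whenever they arrive
        item = inbox.get()
        if item is None:
            break
        wu, res = item
        results[wu] = res
    t.join()
    return results

print(run())
```

The point is that the dispatcher never blocks on any single worker; results arrive whenever they arrive, so latency only affects total wall-clock time, not correctness.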
I just sort of figure the industry will trend more async when latency becomes a bigger issue than compute. It won't work everywhere, but I think we tend to avoid thinking that way right now for lack of any real need to; meanwhile, latency is one of those numbers that only trends down slowly.