Love this.
You mention futures are cooperative and GPUs lack interrupts, but GPU warps already have a hardware scheduler that preempts at the instruction level. ARe you intentionally working above that layer, or do you see a path to a fture executor that hooks into warp scheduling more directly to get preemptive-like behavior?