The TinyGrad approach of going straight to the hardware is telling. Between that, Vulkan compute getting faster for inference (the llama.cpp Vulkan backend is competitive now), and SYCL/oneAPI, it feels like the real threat to CUDA might not be ROCm at all, but a fragmented set of alternatives that each bypass AMD's broken software stack entirely.