Hacker News

alecco, yesterday at 11:14 PM

> directly to PTX

Weird. There's a recent NVIDIA MLIR that is quite good and fast. Or they could target the even easier, more recent and more fashionable tile IR [1] used by CuTile [2]. It is a little higher level but significantly easier to target, and it only loses out on epilogue fusion and similar optimizations.

[1] https://docs.nvidia.com/cuda/tile-ir/

[2] https://developer.nvidia.com/cuda/tile