sebakubisz • today at 7:26 AM • 1 reply

This is the kind of porting work I always hope for when I see a CUDA-only release. Have you thought about publishing the gather-scatter sparse 3D convolution and SDPA attention swaps as a standalone toolkit or writeup? A lot of folks running models locally on Apple Silicon hit the same wall with flash_attn, nvdiffrast, and custom sparse kernels and end up redoing the same work.

Replies

shivampkumar • today at 8:33 AM

That makes so much sense. I'm looking into whether someone has already done this well. If not, I'll try to do it myself.
