Hacker News

yjftsjthsd-h · today at 5:47 AM

Is anyone doing any form of diffusion language models that are actually practical to run today on the actual machine under my desk? There are loads of more "traditional" .gguf options (well, quants) that are practical even on shockingly weak hardware, and I've been seeing things that give me hope that diffusion is the next step forward, but so far it's all been early research prototypes.


Replies

janalsncm · today at 8:39 AM

I worked on it for a more specialized task (query rewriting). It’s blazing fast.

A lot of inference code is set up for autoregressive decoding right now; diffusion is less mature. Not sure whether Ollama or llama.cpp supports it.
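To make the contrast concrete, here's a toy sketch (not from any real library; `toy_model` is a hypothetical stand-in for a network) of why the two decoding loops differ: autoregressive decoding needs one forward pass per token, while masked-diffusion-style decoding fills several positions per pass.

```python
import random

MASK = "<mask>"

def toy_model(tokens):
    # Hypothetical stand-in for a real network: proposes a token for
    # every masked position (here just a random vocabulary pick).
    return [t if t != MASK else random.choice(["a", "b", "c"]) for t in tokens]

def autoregressive_decode(length):
    # One forward pass per token: the loop is inherently sequential,
    # which is the shape most inference engines are built around.
    out = []
    for _ in range(length):
        out.append(toy_model(out + [MASK])[-1])
    return out

def diffusion_decode(length, steps=4):
    # Start fully masked; each step reveals a chunk of positions in
    # parallel, so the number of forward passes is `steps`, not `length`.
    tokens = [MASK] * length
    per_step = -(-length // steps)  # ceiling division
    masked = list(range(length))
    for _ in range(steps):
        proposal = toy_model(tokens)
        for i in masked[:per_step]:
            tokens[i] = proposal[i]
        masked = masked[per_step:]
    return tokens
```

The point of the sketch is just the loop structure: a KV-cache-driven autoregressive engine assumes the first shape, so supporting the second takes real plumbing work, not a config flag.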

Bolwin · today at 5:57 AM

Based on my experience running diffusion image models, I really hope this isn't going to take over anytime soon. Parallel decoding may be great if you have a nice parallel GPU or NPU, but it's dog slow on CPUs.

LoganDark · today at 10:49 AM

Because diffusion models use a substantially different refinement process, most current software isn't built to support them. So I've also been struggling to find a way to play with these models on my machine. I might see if I can cook something up myself before someone else does...
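For a sense of what that refinement process looks like, here's a toy MaskGIT/LLaDA-style confidence-remasking loop (everything here is a hypothetical sketch; `toy_denoiser` is a placeholder, not a real API): each step commits only the most confident guesses and re-masks the rest, and every step needs a full bidirectional pass over the whole sequence.

```python
import random

MASK = -1  # sentinel for an undecided position

def toy_denoiser(tokens, vocab=5):
    # Hypothetical stand-in for the denoiser network: returns a
    # (token, confidence) guess for every position in the sequence.
    return [(random.randrange(vocab), random.random()) for _ in tokens]

def refine(length, steps=4):
    # Confidence-based remasking: each step keeps only the highest-
    # confidence predictions among the still-masked positions. Because
    # every step re-attends to the whole sequence, the per-token
    # KV cache that autoregressive engines rely on doesn't apply.
    tokens = [MASK] * length
    for remaining in range(steps, 0, -1):
        guesses = toy_denoiser(tokens)
        undecided = [i for i, t in enumerate(tokens) if t == MASK]
        keep = max(1, -(-len(undecided) // remaining))  # ceiling division
        undecided.sort(key=lambda i: guesses[i][1], reverse=True)
        for i in undecided[:keep]:
            tokens[i] = guesses[i][0]
    return tokens
```

That mismatch (whole-sequence bidirectional passes instead of cached left-to-right steps) is why bolting diffusion onto an existing autoregressive runtime isn't straightforward.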