
2001zhaozhao • last Tuesday at 11:46 PM

I always thought some kind of block-based diffusion architecture would be the future of LLMs, especially one that can dynamically alter its token generation rate, "reason and generate at the same time", and correct tokens it has just generated. Something like the equivalent of short-term "working memory" in humans. But I have no understanding of the math. Fingers crossed.
