Has anyone tried to estimate what it would take to bake inference for a capable large language model...

danbruc • today at 3:28 PM • 1 reply

Has anyone tried to estimate what it would take to bake inference for a capable large language model into silicon, so that one can pipeline inputs through it and produce outputs at one token per clock cycle?

Replies

knorker • today at 3:30 PM

I'd expect it to require too much RAM bandwidth to be feasible.

RAM is really slow at silicon speeds. Very little is reachable in one clock cycle, unless the clock cycle is abysmally slow.
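
To put a rough number on that, here is a back-of-envelope sketch. It assumes the weights are streamed from external RAM for every token rather than hardwired into the chip, and every figure in it (7B parameters, 8-bit weights, a 1 GHz clock, a memory system of about 1 TB/s) is an illustrative assumption, not a measurement.

```python
# Back-of-envelope: bandwidth needed if every token must read all weights from RAM.
# All numbers below are illustrative assumptions, not measurements.

def required_bandwidth_bytes_per_s(params: float,
                                   bytes_per_param: float,
                                   tokens_per_s: float) -> float:
    """Bytes per second needed when each token touches every weight once."""
    return params * bytes_per_param * tokens_per_s

params = 7e9            # assumed model size: 7 billion parameters
bytes_per_param = 1.0   # assumed 8-bit weights
clock_hz = 1e9          # assumed 1 GHz clock, one token per cycle

needed = required_bandwidth_bytes_per_s(params, bytes_per_param, clock_hz)
available = 1e12        # assumed ~1 TB/s memory system (order of magnitude only)

print(f"needed:    {needed:.1e} B/s")
print(f"available: {available:.1e} B/s")
print(f"shortfall: {needed / available:.1e}x")
```

Under these assumptions the gap is several orders of magnitude, so the design would only work if the weights sit right next to the logic that uses them, or if the clock is slowed down a lot.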
