Data bandwidth limits distributed training under current architectures. Really interesting implications if we can make progress on that.
Limits, but doesn't prohibit. See https://www.primeintellect.ai/blog/intellect-3: it's still useful and can scale enormously. It takes a particular shape and relies heavily on RL, but it's still big.
What bandwidth limits? I'm assuming the forward and backward passes have to be done sequentially?
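For a rough sense of the bottleneck being discussed: in plain data-parallel training, every optimizer step requires synchronizing gradients across workers (typically a ring all-reduce), and that traffic scales with model size. The sketch below is a back-of-envelope calculation under assumed numbers (fp16 gradients, a 70B-parameter model, illustrative link speeds); the function names are made up for the example, not from any library.

```python
# Back-of-envelope: gradient sync traffic per optimizer step in plain
# data-parallel training. All numbers here are illustrative assumptions.

def allreduce_bytes_per_step(n_params: float, bytes_per_grad: int = 2) -> float:
    """A ring all-reduce moves roughly 2x the gradient buffer per worker per step."""
    return 2 * n_params * bytes_per_grad

def step_comm_seconds(n_params: float, bandwidth_gbps: float) -> float:
    """Seconds spent just moving gradients at a given link bandwidth."""
    bytes_moved = allreduce_bytes_per_step(n_params)
    bytes_per_second = bandwidth_gbps * 1e9 / 8  # Gb/s -> bytes/s
    return bytes_moved / bytes_per_second

params = 70e9  # a 70B-parameter model, fp16 gradients
print(step_comm_seconds(params, 400))  # datacenter-class 400 Gb/s link -> 5.6 s
print(step_comm_seconds(params, 1))    # 1 Gb/s internet link -> 2240.0 s
```

The gap between those two numbers is why naive data parallelism over the internet doesn't work, and why approaches like the one in the linked post change the algorithm (infrequent sync, RL-style rollouts) rather than just adding more machines. Forward/backward being sequential within one worker is a separate issue from this cross-worker sync cost.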