Hacker News

DesaiAshu, last Sunday at 11:51 PM

Data bandwidth limits distributed training under current architectures. Really interesting implications if we can make progress on that.


Replies

andoando, yesterday at 7:59 PM

What bandwidth limits? I'm assuming the forward and backward passes have to be done sequentially?
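For context on where the bandwidth pressure comes from: in data-parallel training, every replica must exchange its gradients each optimizer step, and that all-reduce payload is what strains slow inter-node links. A minimal back-of-envelope sketch (the model size, precision, and link speed below are illustrative assumptions, not numbers from the thread):

```python
# Back-of-envelope: data-parallel training syncs gradients every step.
# All numbers below are illustrative assumptions.
params = 7e9              # a 7B-parameter model
bytes_per_grad = 2        # fp16 gradients
grad_bytes = params * bytes_per_grad        # ~14 GB exchanged per step

link_gbps = 1.0           # a typical internet-scale link, in Gbit/s
link_bytes_per_s = link_gbps * 1e9 / 8      # convert Gbit/s -> bytes/s

sync_seconds = grad_bytes / link_bytes_per_s
print(f"gradient payload: {grad_bytes / 1e9:.0f} GB")
print(f"naive sync time over {link_gbps} Gbit/s: {sync_seconds:.0f} s")
```

A datacenter interconnect (hundreds of Gbit/s) makes this sync cheap relative to compute; over commodity internet links it dominates the step time, which is why geographically distributed training needs different algorithms (e.g. infrequent or compressed synchronization) rather than plain data parallelism.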

dogcomplex, yesterday at 9:28 AM

Limits it but doesn't prohibit it. See https://www.primeintellect.ai/blog/intellect-3 - still useful and can scale enormously. Takes a particular shape and relies heavily on RL, but still big.