Hacker News

aitchnyu, yesterday at 2:13 PM

Tangential. I'm a newb, can you name the concept of partitioning weights so we don't need to load the whole thing?


Replies

agunapal, today at 10:18 AM

Do you mean model sharding?
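
If sharding is what was meant, here is a rough sketch of the idea, not how any particular framework does it. It assumes weights are plain lists stored as JSON, and the names `save_sharded`, `load_tensor`, and the `shard-00000.json` / `index.json` file layout are made up for illustration. The weights are split across several files, and an index records which file holds each tensor, so a reader can open only the shard it needs.

```python
import json
import os
import tempfile


def save_sharded(weights, directory, max_per_shard):
    """Write weights into several shard files plus an index (tensor name -> shard file).

    Hypothetical layout for illustration only; real checkpoints use binary formats.
    """
    names = sorted(weights)
    index = {}
    for start in range(0, len(names), max_per_shard):
        shard_name = f"shard-{start // max_per_shard:05d}.json"
        shard = {n: weights[n] for n in names[start:start + max_per_shard]}
        with open(os.path.join(directory, shard_name), "w") as f:
            json.dump(shard, f)
        for n in shard:
            index[n] = shard_name
    with open(os.path.join(directory, "index.json"), "w") as f:
        json.dump(index, f)
    return index


def load_tensor(directory, name):
    """Load one tensor by reading the index and then only the shard that contains it."""
    with open(os.path.join(directory, "index.json")) as f:
        index = json.load(f)
    with open(os.path.join(directory, index[name])) as f:
        return json.load(f)[name]


if __name__ == "__main__":
    weights = {"layer0.w": [0.1, 0.2], "layer1.w": [0.3], "layer2.w": [0.4, 0.5]}
    with tempfile.TemporaryDirectory() as d:
        save_sharded(weights, d, max_per_shard=2)
        # Only the shard holding layer2.w is opened here, not the whole model.
        print(load_tensor(d, "layer2.w"))
```

The same index-plus-shards idea also lets different shards be placed on different devices or machines, which is the other common sense of "sharding".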