Hacker News

snowhale · yesterday at 7:54 PM · 0 replies · view on HN

the LoRA on-chip SRAM angle is interesting, but it's also where this gets hard. the whole pitch is that the weights are physical transistors, yet LoRA works by adding a low-rank update (two small matrices per layer) to those weights at inference time. so you either keep the adapters purely in SRAM (limited by how many you can fit) or you tape out a new chip for each fine-tune. neither is great. might end up fast but inflexible -- good for commodity tasks, not for anything that needs per-customer customization.
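to make the size asymmetry concrete, here's a minimal numpy sketch of a LoRA-adapted linear layer -- dimensions and names are my own illustration, not from any real chip. the frozen base matrix W stands in for the weights baked into silicon; only the small A and B adapter matrices would need to live in SRAM:

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 512, 512, 8               # r << d: the low-rank bottleneck
W = rng.standard_normal((d_out, d_in))     # frozen base weights (the "chip")
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection (zero init)
alpha = 16.0

def lora_forward(x):
    # base path (fixed in hardware) plus low-rank correction (SRAM-resident)
    return x @ W.T + (x @ A.T) @ B.T * (alpha / r)

x = rng.standard_normal((1, d_in))
y = lora_forward(x)

# SRAM cost of the adapter vs. re-fabricating the full weight matrix:
adapter_params = A.size + B.size  # 2 * d * r  =  8,192
full_params = W.size              # d * d      =  262,144
```

the adapter is ~3% of the base layer here, which is why SRAM-resident fine-tunes are plausible at all -- but that 2*d*r cost is per layer and per customer, so it still caps how many fine-tunes one chip can hold.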