
visarga, yesterday at 2:57 PM

> Frontier AI companies are selling at a loss.

There are huge economies to be had by batching requests and using lots of RAM for MoE (sparse models). You can't achieve that efficiency at batch size 1 on a single node.
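
A rough back-of-envelope sketch of the batching argument, not from the comment itself. It assumes decoding is limited purely by how fast weights can be read from memory, that MoE routing is uniform and independent per token, and it ignores KV cache traffic, compute limits, and networking. Every number and function name below is a made-up illustration, not a measurement of any real model or system.

```python
def expected_experts_touched(batch_size, num_experts, experts_per_token):
    """Expected number of distinct experts hit by one decode step.

    Assumes each token picks its experts uniformly and independently,
    which real routers only approximate.
    """
    p_expert_unused = (1 - experts_per_token / num_experts) ** batch_size
    return num_experts * (1 - p_expert_unused)


def seconds_per_token(batch_size, shared_bytes, expert_bytes,
                      num_experts, experts_per_token, mem_bandwidth):
    """Memory-bound time per generated token for one decode step.

    Weights are read once per step and shared by every request in the
    batch, so the read cost is divided by the batch size.
    """
    touched = expected_experts_touched(batch_size, num_experts, experts_per_token)
    bytes_read = shared_bytes + touched * expert_bytes
    return bytes_read / mem_bandwidth / batch_size


# Hypothetical model and hardware figures, chosen only for illustration.
SHARED_BYTES = 10e9      # attention and other always-active weights
EXPERT_BYTES = 2e9       # size of one expert
NUM_EXPERTS = 64
EXPERTS_PER_TOKEN = 2
MEM_BANDWIDTH = 2e12     # bytes per second

for batch in (1, 8, 64, 512):
    t = seconds_per_token(batch, SHARED_BYTES, EXPERT_BYTES,
                          NUM_EXPERTS, EXPERTS_PER_TOKEN, MEM_BANDWIDTH)
    print(f"batch={batch:4d}  seconds/token={t:.6f}")
```

Under these assumptions the per-token cost keeps falling as the batch grows, because the total bytes read per step level off once most experts are in use while the number of tokens sharing that read keeps increasing. The catch is that all 64 experts have to sit in memory to serve large batches, which is where the "lots of RAM" part comes in.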


Replies

asjir, yesterday at 3:12 PM

Exactly, they put a lot of money into engineering and it does give results.

eikenberry, yesterday at 7:58 PM

Inference isn't the problem.