Hacker News

erispoe, last Monday at 1:57 PM

I mostly have long-running autonomous tasks, so it doesn't matter how slow inference is. If I optimize for latency, it means I'm turning into the limiting factor.