Hacker News

redbell, today at 11:16 AM

Another related submission from 22 days ago: iPhone 17 Pro Demonstrated Running a 400B LLM (700+ points, 300+ comments): https://news.ycombinator.com/item?id=47490070


Replies

zozbot234, today at 12:03 PM

That's very impressive, but it's streaming in weights from flash storage. That's not really viable in a mobile context because it will use way too much power. Smaller models are far more applicable to typical use, perhaps with mid-sized models (like the Gemma4 26A4B model) offloading weights to SSD for the rare cases that call for slower "pro" inference.