Hacker News

eagerpace · yesterday at 6:53 PM · 6 replies

I've been wondering about this failed Apple Intelligence project, but the more I think about it, Apple can afford to sit and wait. In 5 years we're going to have Opus 4.6-level performance on-device, and Apple is the only company that stands to benefit from it. Nobody wants to be sending EVERY request to someone else's cloud server.


Replies

meatmanek · yesterday at 7:09 PM

I think there are a lot of false assumptions in that assertion:

   - that a bunch of users won't jump ship if Apple stagnates for 5 years
   - that a product based on a model with Q1 2026 SoTA performance would be competitive with products using 2031's models
   - that just having access to good (by 2025/2026 standards) models is the big thing Apple needs for Apple Intelligence to finally be useful

On that last point, I think the OS/app-level features are almost more important than the model itself. If the model can't _do_ anything, it doesn't really matter how intelligent it is. If Apple rests on its laurels for 5 years, would its OS, built-in apps, and 3rd-party apps have all the hooks needed for a useful AI product?
well_ackshually · yesterday at 7:07 PM

Assuming the rate of progress on AI stays the same:

1/ No, you don't get Opus 4.6-level performance on devices with 12 GB of RAM; 7B quantised models just don't get that good. Still quite good, mind you. I believe the biggest advance to come from mobile AI will be apps providing tools and the device providing a discovery service (see Android's AppFunctions, if it were ever documented well): output quality doesn't matter as much on device, but really efficient and reliable tool calling is a game changer.

2/ Opus 4.6 is now Opus 4.6+5years, with new capabilities that make people want to keep sending everything to someone else's cloud server instead of burning their battery life.
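The tool-calling point in 1/ can be sketched in a few lines. This is a hypothetical setup (the tool names and dispatch shape are made up, not Android's actual AppFunctions API): a small on-device model only has to emit structured JSON naming a tool and its arguments, and the OS routes the call to the right app function.

```python
import json

# Hypothetical app-provided functions registered with a discovery service.
TOOLS = {
    "set_alarm": lambda time: f"alarm set for {time}",
    "send_message": lambda to, body: f"sent {body!r} to {to}",
}

def dispatch(model_output: str) -> str:
    """Parse the model's JSON tool call and run the matching app function."""
    call = json.loads(model_output)
    fn = TOOLS[call["tool"]]
    return fn(**call["args"])

# The on-device model never needs frontier-grade prose; it just has to
# reliably produce this kind of structured output:
print(dispatch('{"tool": "set_alarm", "args": {"time": "07:30"}}'))
```

The value is in the routing, not the fluency: a 7B model that picks the right tool with the right arguments is useful even if its free-form writing isn't.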

bitpush · yesterday at 7:04 PM

Have you tried running a reasonably sized model locally? You need a minimum of 24 GB of VRAM to load one up, 32 GB to be safe, and that isn't even frontier, just the bare minimum.

A good analogy would be streaming. To get good quality, sure, you can store the video file, but it's going to take up space. Videos are 2-4 GB (let's say), and streaming will always be easier and better.

For models, we're looking at hundreds of GB worth of model params. There's no way we can squeeze that into, say, 1 GB without loss in quality.
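The sizing claim is just parameter-count arithmetic: weight memory is roughly params × bits per param / 8, before you add KV cache, activations, or runtime overhead. A quick sketch (the example model sizes are illustrative, not a specific product's specs):

```python
# Back-of-envelope memory for model weights alone (decimal GB).
# Ignores KV cache, activations, and runtime overhead, which add more.
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

print(weight_memory_gb(7, 4))     # 7B at 4-bit: 3.5 GB -- phone territory
print(weight_memory_gb(70, 16))   # 70B at fp16: 140.0 GB
print(weight_memory_gb(405, 16))  # 405B at fp16: 810.0 GB -- "100s of GB"
```

This is why the two camps talk past each other: a 4-bit 7B model genuinely fits on a phone, and a frontier-scale model genuinely doesn't.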

So nope, beyond minimal classification and such, on-device isn't happening.

--

EDIT:

> Nobody wants to be sending EVERY request to someone else's cloud server.

We do this already with streaming. You watch YouTube, which hosts videos in the "cloud". For the latest MKBHD video, I don't care about having it locally (for the most part). I just wanna watch the video and be done with it.

Same with LLMs. If LLMs are here to stay, most people will wanna use the latest/greatest models.

---

EDIT-EDIT:

If your response is that Apple will figure it out somehow: nope, Apple is sitting out the AI race. So it has no technology. It has nothing beyond whatever open source is available or whatever it can license from the rest. So nope, Apple isn't pushing the limits. They are watching the world move beyond them.

hackingonempty · yesterday at 8:47 PM

If the goal was as much to establish a trademark for "Apple Intelligence" as anything else, then it wasn't a failure.

liuliu · yesterday at 7:35 PM

> and Apple is the only company that stands to benefit from it.

And that is exactly why it won't happen (like that).

candiddevmike · yesterday at 7:45 PM

How do you do on-device inference while preserving battery life?
