Are the Apple neural engines even a practical target of LLMs?
Maybe not strictly impossible, but ANE was designed with an earlier, pre-LLM style of ML. Running LLMs on ANE (e.g. via Core ML) possible in theory, but the substantial model conversion and custom hardware tuning required makes for a high hurdle IRL. The LLM ecosystem standardized around CPU/GPU execution, and to date at least seems unwilling to devote resources to ANE. Even Apple's MLX framework has no ANE support. There are models ANE runs well, but LLMs do not seem to be among them.
There is a project on github named ANEMLL. Was discussed here a month ago, running LLMs on iPhone - https://news.ycombinator.com/item?id=47490070
It is possible but requires a very specific model design to utilize. As this reverse engineering effort has shown [0] "The ANE is not a GPU. It’s not a CPU. It’s a graph execution engine." To build one requires using a specific pipeline specifically for CoreML [1].
[0] https://maderix.substack.com/p/inside-the-m4-apple-neural-en... [1] https://developer.apple.com/documentation/coreml