The 1B model works on iPhones[0].
See my other comments; anemll appears to use less memory.
[0] https://huggingface.co/anemll/anemll-llama-3.2-1B-iOSv2.0