I'm literally working on an iOS app right now that needs to infer some input fields from free text typed by the user. To handle typos and unstructured text (pricing, dates, etc.), I was weighing a cloud LLM against a basic local parser, or even a local on-device LLM (running on the ANE for iPhone 15+ devices, with a different on-device LLM for older models).
For the on-device LLM option, I went to HuggingFace and filtered for the smallest available models that could do the job, and Granite-4.0-h-1b works just fine: it corrects typos and infers dates, currencies, all the fields I need.
And it got me thinking how my first reflex was to reach for a cloud LLM, which is waaay overkill for my need. Granted, an on-device LLM has to be bundled at install time or downloaded after the fact (which adds latency the first time the user needs it), but it's still a better tradeoff than a cloud LLM.
In the end I decided on a basic parser, and so far it seems to work fine. Granted, it struggles with some words, but I just need to tune it for as much typo coverage as possible without triggering false positives.
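The typo-coverage-vs-false-positives tradeoff can be sketched with plain edit distance: accept a token as a known field keyword only if it's within a small Levenshtein distance of one. This is just a minimal sketch of the idea, not the actual parser; the keyword list and threshold are made up for illustration.

```swift
// Minimal sketch of typo-tolerant keyword matching for a hand-rolled parser.
// The keyword list and distance threshold are illustrative, not from the app.
func levenshtein(_ a: String, _ b: String) -> Int {
    let s = Array(a), t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    var prev = Array(0...t.count)
    var curr = [Int](repeating: 0, count: t.count + 1)
    for i in 1...s.count {
        curr[0] = i
        for j in 1...t.count {
            let cost = s[i - 1] == t[j - 1] ? 0 : 1
            curr[j] = min(prev[j] + 1,          // deletion
                          curr[j - 1] + 1,      // insertion
                          prev[j - 1] + cost)   // substitution
        }
        swap(&prev, &curr)
    }
    return prev[t.count]
}

/// Returns the known keyword closest to `token`, or nil if nothing is
/// within `maxDistance` edits — the cutoff is what keeps false positives down.
func correct(_ token: String, against keywords: [String], maxDistance: Int = 2) -> String? {
    let lowered = token.lowercased()
    let best = keywords
        .map { ($0, levenshtein(lowered, $0)) }
        .min { $0.1 < $1.1 }
    guard let (word, dist) = best, dist <= maxDistance else { return nil }
    return word
}

let fields = ["price", "date", "currency"]
print(correct("pirce", against: fields) ?? "no match")   // "pirce" is 2 edits from "price"
print(correct("banana", against: fields) ?? "no match")  // too far from any keyword → no match
```

Raising `maxDistance` widens typo coverage but lets unrelated words start matching, which is exactly the tuning knob mentioned above.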
A lot of developers have that reflex too, go along with it, and then just pass the API costs on to the customer. I could have gone that route as well, but it turned out I don't even need an LLM for my use case.
Apple includes a local LLM on all recent iPhones: https://developer.apple.com/documentation/foundationmodels. It seems like a bad idea to force your users to download a 3GB LLM just to parse a text field.
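For anyone curious what using the built-in model looks like: the Foundation Models framework exposes a session-based API. This is a rough sketch based on the documented `LanguageModelSession` API; it only runs on supported OS versions and hardware, and the availability-handling shown here is an assumption about how you'd gate it, not production code.

```swift
// Sketch only: requires iOS 26+/Apple Intelligence hardware, so it can't run
// in a generic environment. Checks that the system model is usable, then asks
// it to normalize a free-text field. The prompt wording is illustrative.
import FoundationModels

func parseField(_ rawInput: String) async throws -> String? {
    // The system model ships with the OS — no multi-GB download by the app.
    let model = SystemLanguageModel.default
    guard model.availability == .available else {
        // Fall back to the basic parser on unsupported devices.
        return nil
    }
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Extract the price and date from this text, fixing typos: \(rawInput)"
    )
    return response.content
}
```

The point stands either way: for a single text field, a small parser with a typo-tolerance pass covers most of this without shipping or invoking any model.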