This feels like a lot of work for low reward; the technical/business infrastructure would be wild. And if anyone wants to offload their prompts to users' browsers, they might as well just use the Chrome API correctly? How many server side prompts would realistically be useful to offload to a low end model like this?
Plus, even if you really wanted to do that, WebGPU exists and has for a while, right?
> How many server side prompts would realistically be useful to offload to a low end model like this?
There are a lot of ways this API could go, e.g. more powerful models eventually, or perhaps integration with cloud models. For example, I could see Google trying to make Gemini the default model for users signed into Chrome.
Nefarious use cases. Run that on some sucker's machine.
Edit: a simple example is a spam bot.