I would imagine running a local LLM for development isn’t as popular as using a hosted provider. I don’t personally host a local model, but I have shared GPUs and storage volumes with VMs and I didn’t see it as that much of a hassle. What kinds of problems are you running into?
Doesn’t Ghostty also use graphics acceleration? I was under the impression that rendering text is a relatively demanding graphics compute task.
I run local LLMs on my MacBook alongside frontier models for different tasks. I’m in the process of setting up a three-Mac-Studio system to serve AI to my team.