You can create an MCP server that calls out to Ollama, then have Claude farm work out to local models where the raw power isn't required. Claude can then review the local model's output.
It's not 100% offline, but there is a dramatic drop in token usage, as long as you can put up with the speed.
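For anyone wanting to try this, here is a rough sketch of the piece such a tool would wrap: a plain HTTP call to Ollama's `/api/generate` endpoint on its default local port. The function names `build_request` and `ask_local_model`, the model name `llama3.2`, and the timeout are my own placeholders, not anything from a specific setup, so swap in whatever you have pulled locally.

```python
import json
import urllib.request

# Ollama's default local endpoint; change this if your instance listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.2"):
    """Build the POST request for a single non-streaming completion.

    The model name is a placeholder (assumption); use any model you have pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="llama3.2", timeout=300):
    """Send the prompt to the local model and return the generated text.

    The generous timeout is a guess: local models can be slow.
    """
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

To make this callable from Claude, one option is to register `ask_local_model` as a tool with the official MCP Python SDK (for example via `FastMCP` and its `tool()` decorator) and add that server to your Claude config. Claude then decides what to hand off and checks what comes back.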