I've been experimenting with running Gemma with MLX directly within my own harness: https://github.com/cjroth/mlx-harness