How does this compare to llama.cpp in terms of performance?
MLX is a bit faster (by a low double-digit percentage), but it uses a bit more RAM. A worthwhile tradeoff for many.