Is this something that will show up in Ollama any time soon to increase context size of local models?
KV-cache quantization has long been available in llama.cpp.
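For context, llama.cpp exposes this through the `--cache-type-k` / `--cache-type-v` options on `llama-server` and `llama-cli` (exact flag spellings can vary between versions, and 4-bit V-cache quantization generally also requires flash attention). A rough sketch, with a placeholder model path:

```shell
# Quantize both the K and V caches to q8_0, roughly halving KV-cache
# memory versus the default f16 and allowing a larger context window.
# Model path and context size below are placeholders; adjust to taste.
./llama-server -m ./model.gguf -c 32768 \
  --cache-type-k q8_0 --cache-type-v q8_0 \
  --flash-attn
```

Newer Ollama releases expose the same underlying feature via the `OLLAMA_KV_CACHE_TYPE` environment variable (together with `OLLAMA_FLASH_ATTENTION=1`), so check the docs for the version you're running.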