I would love to see a purely Mamba-based 120B model, and whether it would outcompete the open-weights OpenAI model.