thanks! we explain how it scales to larger models in the last section of the OP blog post
Shame you stopped short of actually benchmarking that scale though, eh?