There is some recent work on modularizing knowledge in LLMs:
https://arxiv.org/html/2605.06663v1
It might be possible to train a large generalist model as a composition of modules, some of which can be dropped dynamically at inference time depending on the prompt.
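A toy sketch of the idea, not taken from the linked paper: every module name, the keyword-based router, and the score threshold below are illustrative assumptions. A real system would presumably use a learned gating network over prompt embeddings rather than keyword overlap, but the control flow (score modules per prompt, skip the ones below threshold) is the same.

```python
# Hypothetical sketch of prompt-conditioned module dropping.
# Module names, keywords, and the threshold are illustrative,
# not from the linked paper.

def math_module(prompt):
    return "math: handled"

def code_module(prompt):
    return "code: handled"

def general_module(prompt):
    return "general: handled"

# Each module carries a trigger-keyword set; an empty set marks an
# always-on backbone module that is never dropped.
MODULES = {
    "math": (math_module, {"integral", "sum", "prove"}),
    "code": (code_module, {"python", "function", "bug"}),
    "general": (general_module, set()),
}

def route(prompt, threshold=1):
    """Score each module against the prompt; drop those below threshold.

    Keyword overlap stands in for a learned router here.
    """
    tokens = set(prompt.lower().split())
    active = []
    for name, (fn, keywords) in MODULES.items():
        score = len(tokens & keywords)
        if not keywords or score >= threshold:
            active.append((name, fn))
    return active

def run(prompt):
    # Only the retained modules execute; the dropped ones cost
    # no compute for this prompt.
    return [fn(prompt) for _, fn in route(prompt)]

print(run("find the integral of x"))  # math and general modules fire
print(run("hello there"))             # only the general module fires
```

The appeal is that the dropped modules never run at all, so inference cost scales with the modules a prompt actually needs rather than with the full model.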