I also wonder about JS-only, Python-only, etc. models.
Maybe the future is a selection of local, specific stack trained models?
A model's ability to generalise at coding would likely get worse if you removed high-quality training data like all of Python.
There is some recent work on modularizing knowledge in LLMs.
https://arxiv.org/html/2605.06663v1
It might be possible to train a big generalist that is a composition of modules, some of which can be dropped dynamically at inference time, depending on the prompt.
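As a rough illustration of what that could look like, here's a toy sketch of prompt-conditioned module dropping: a "generalist" composed of per-stack expert modules, where modules irrelevant to the prompt are skipped at inference. Everything here (the keyword router, module names, stubbed forward passes) is hypothetical and not taken from the linked paper.

```python
# Hypothetical sketch: a generalist composed of per-stack expert
# modules, with irrelevant modules dropped per request.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExpertModule:
    name: str
    keywords: set          # crude router signal: keyword match on the prompt
    run: Callable[[str], str]  # the module's forward pass (stubbed here)

def make_expert(name: str, keywords: set) -> ExpertModule:
    return ExpertModule(name, keywords, lambda prompt: f"[{name} output]")

EXPERTS = [
    make_expert("python", {"python", "pip", "def"}),
    make_expert("javascript", {"javascript", "npm", "const"}),
    make_expert("general", set()),  # always-on shared backbone
]

def route(prompt: str) -> list:
    """Keep only modules whose keywords appear in the prompt (plus the
    always-on general module); the rest are 'dropped' and never loaded
    or executed for this request."""
    words = set(prompt.lower().split())
    return [e for e in EXPERTS if not e.keywords or e.keywords & words]

def infer(prompt: str) -> list:
    return [e.run(prompt) for e in route(prompt)]

active = route("write a python def to parse json")
print([e.name for e in active])  # only the python + general modules stay active
```

A real system would route on learned representations rather than keywords, but the interesting property is the same: the memory and compute for the JS module never get touched on a Python prompt.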