Because its training data contains information on how to map from a language's documentation to actual programs, it can follow the same pattern to map from the documentation of any language to programs in that language.
But I think it will have difficulty crossing paradigm boundaries using documentation alone.
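
As a rough illustration of what a paradigm boundary can look like, here is the same small task written in an imperative style and in a functional style. This is only a sketch: the function names are made up for the example, and Python is used just because it supports both styles. The point is that the second version is not the first one with different keywords; it structures the computation differently, which documentation of syntax alone may not convey.

```python
from functools import reduce


# Imperative style: keep a mutable accumulator and update it in a loop.
# (Hypothetical name, chosen only for this illustration.)
def total_imperative(xs):
    total = 0
    for x in xs:
        total += x
    return total


# Functional style: no mutation; the result is expressed as a fold
# over the list, starting from the initial value 0.
# (Hypothetical name, chosen only for this illustration.)
def total_functional(xs):
    return reduce(lambda acc, x: acc + x, xs, 0)
```

Both functions return the same result for the same input, but moving from one to the other means rethinking the shape of the solution rather than translating it line by line.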