This is just my own understanding, but doesn't a webpage consist of a bunch of nodes, which can be combined in any way? And an HTML document is supposed to be a complete set of nodes, so a combination of two documents won't be a single document anymore.
Nodes can be addressed individually, but a document is the unit of transmission, which also contains metadata. You can combine nodes as you like, but you can't really combine two already packed and annotated documents of nodes.
So I would say it is more due to the semantic meaning. I think there was also the idea of requesting arbitrary sets of nodes, but that was never developed, and with the shift away from semantic documents it didn't make sense anymore.
> a webpage consist of a bunch of nodes, which can be combined in any way
More or less, but manipulating the nodes requires JavaScript, which some people would like to avoid.
I think the quickest way to say it is that there is only one head on a page, and every HTML file needs a head. So if you include one into the other, you either have two heads, or the inner document didn't have a head.
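To make the two-heads point concrete, here is a rough sketch in Python using only the standard library's `html.parser`. The `{include}` placeholder and the `naive_include` helper are made up for illustration; they are not a real HTML feature. The sketch just pastes one file into another as text and counts the tags that come out.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how many times each start tag appears in the markup."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(markup):
    parser = TagCounter()
    parser.feed(markup)
    parser.close()
    return parser.counts


# "{include}" is a hypothetical placeholder, not real HTML syntax.
OUTER = (
    "<!DOCTYPE html><html><head><title>Outer</title></head>"
    "<body><h1>Outer</h1>{include}</body></html>"
)
INNER_DOCUMENT = (
    "<!DOCTYPE html><html><head><title>Inner</title></head>"
    "<body><p>Inner</p></body></html>"
)
INNER_FRAGMENT = "<p>Inner</p>"


def naive_include(outer, inner):
    """Textual include: paste the inner markup where the placeholder is."""
    return outer.replace("{include}", inner)


both = count_tags(naive_include(OUTER, INNER_DOCUMENT))
frag = count_tags(naive_include(OUTER, INNER_FRAGMENT))

print("full document included:", both["head"], "head tags")
print("fragment included:", frag["head"], "head tag")
```

Including the full document leaves two `head` tags (and two `html` tags) in the result, while including the headless fragment leaves one. As far as I know, a browser's parser would recover from the first case by ignoring the stray tags rather than failing, but then the inner file's metadata is silently thrown away, which is the same dilemma.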