Hacker News

Why can't HTML alone do includes?

390 points by susam 05/03/2025 | 356 comments

Comments

dwheeler 05/03/2025

HTML was historically an application of SGML, and SGML could do includes. You could define a new "entity", and if you created a "system" entity, you could refer to it later and have it substituted in.

    <!DOCTYPE html example [
      <!ENTITY myheader SYSTEM "myheader.html">
    ]>
    ....
    &myheader;
SGML is complex, so various efforts were made to simplify HTML, and that's one of the capabilities that was dropped along the way.
dimal 05/03/2025

This was the rabbit hole that I started down in the late 90s and still haven’t come out of. I was the webmaster of the Analog Science Fiction website and I was building tons of static pages, each with the same header and side bar. It drove me nuts. So I did some research and found out about Apache server side includes. Woo hoo! Keeping it DRY (before I knew DRY was a thing).

Yeah, we’ve been solving this over and over in different ways. For those saying that iframes are good enough, they’re not. Iframes don’t expand to fit content. And server side solutions require a server. Why not have a simple client side method for this? I think it’s a valid question. Now that we’re fixing a lot of the irritation in web development, it seems worth considering.

throwup238 05/03/2025

The feature proposal was called HTML Imports [1], created as part of the Web Components effort.

> HTML Imports are a way to include and reuse HTML documents in other HTML documents

There were plans for <template> tag support and everything.

If I remember correctly, Google implemented the proposed spec in Blink but everyone else balked for various reasons. Mozilla was concerned with the complexity of the implementation and its security implications, as well as the overlap with ES6 modules. Without vendor support, the proposal was officially discontinued.

[1] https://www.w3.org/TR/html-imports/

Lammy 05/03/2025

Netscape 4 has this with inflow layers — `<ILAYER SRC=included.html></ILAYER>`

https://web.archive.org/web/19970630074729fw_/http://develop...

https://web.archive.org/web/19970630094813fw_/http://develop...

Null-Set 05/03/2025

The name of this feature is transclusion.

https://en.wikipedia.org/wiki/Transclusion

It was part of Project Xanadu, and originally considered to be an important feature of hypertext.

Notably, mediawiki uses transclusion extensively. It sometimes feels like the wiki is the truest form of hypertext.

Linux-Fan 05/03/2025

Isn't this what proper framesets (not iframes) were supposed to do a long time ago (HTML 4?). At least they autoexpanded just fine and the user could even adjust the size to their preference.

There was a lot of criticism for frames [1] but still they were successfully deployed for useful stuff like Java API documentation [2].

In my opinion the whole thing didn't stick mostly because of too little flexibility for designers: framesets were probably good enough for useful information pages but didn't account for all the designers' needs, with their bulky scrollbars and limited number of subspaces on the screen. Today it is too late to revive them because framesets as-is probably wouldn't work well on mobile...

[1] <https://www.nngroup.com/articles/why-frames-suck-most-of-the...> - I love how much of it is not applicable anymore and all of these problems mentioned with frames are present in today's web in an even nastier way?

[2] <https://www.eeng.dcu.ie/~ee553/ee402notes/html/figures/JavaD...>

rchaud 05/03/2025

"Includes" functionality is considered to be server-side, i.e. handled outside of the web browser. HTML is client-side, and really just a markup syntax, not a programming language.

As the article says, the problem is a solved one. The "includes" issue is how every web design student learns about PHP. In most CMSes, "includes" become "template partials" and are one of the first things explained in the documentation.

There really isn't any need to make includes available through just HTML. HTML is a presentation format and doesn't do anything interesting without CSS and JS anyway.

socalgal2 05/03/2025

There are all kinds of issues with HTML includes, as others have pointed out.

If main.html includes child/include1.html and child/include1.html has a link with href="include2.html", then when the user clicks the link, where does it go? If it goes to "include2.html", which by its name was meant to be included, then that page is going to be missing everything else. If it goes to main.html, how does it specify that this time it should use include2.html, not include1.html?

You could do the opposite: you can have article1.html, article2.html, article3.html, etc., each including header.html, footer.html, and navi.html. OK, that works, but now you've made it so that a global change to the structure of your articles requires editing all articles. In other words, if you want to add comments.html to every article you have to edit all articles, and you're back to wanting to generate pages from articles based on some template, at which point you don't need the browser to support include.

I also suspect there would be other issues, like the header wanting to know the title, or the footer wanting a next/prev link, which now requires some way to communicate this info between includes, and you're basically back to generating the pages, with include not being a solution.

I think if you work through the issues you'll find an HTML include would be practically useless for most use cases.

uallo 05/03/2025

There is an open issue about this at WHATWG (also mentioned in the comment section of the blog post):

Client side include feature for HTML

https://github.com/whatwg/html/issues/2791

austin-cheney 05/03/2025

So, HTML did have includes and they fell out of favor.

The actual term include is an XML feature and it’s that feature the article is hoping for. HTML had an alternate approach that came into existence before XML. That approach was frames. Frames did much more than XML includes and so HTML never gained that feature. Frames lost favor due to misuse, security, accessibility, and a variety of other concerns.

simonjgreen 05/03/2025

I know it’s not straight HTML, but SSI (server side includes) helped with this and back in the day made for some incredibly powerful caching solutions. You could write out chunks of your site statically and periodically refresh them on the server side, while benefitting from serving static content to your users. (This was in the pre-Varnish era, and before everyone was using memcached.)

I personally used this to great success on a couple of Premier League football club websites around the mid 2000s.

Kuyawa 05/03/2025

This is the closest we can do today:

  -- index.html

  <html>
  <body>
    <script src="header.js"></script>
    <main>
      <h1>Hello includes</h1>
    </main>
    <script src="footer.js"></script>
  </body>
  </html>

  -- header.js

  document.currentScript.outerHTML = `
  <header>
    <h1>Header</h1>
  </header>`

  -- footer.js

  document.currentScript.outerHTML = `
  <footer>
    <h1>Footer</h1>
  </footer>`
Scripts replace their own tags with HTML, producing a clean source. It's not pretty, but it works on the client.
tln 05/03/2025

There used to be a thing for this

https://caniuse.com/imports

rorylaitila 05/03/2025

I'm a full stack developer. I do server side rendering. I agree that this is a 'solved problem' for that case. However there are many times I don't want to run a server or a static site generator. I manage a lot of projects. I don't want more build steps than necessary. I just want to put some HTML on the net with some basic includes, without JavaScript. But currently I would go the web component route and accept the extra JS.

1718627440 05/03/2025

This is just my own understanding, but doesn't a webpage consist of a bunch of nodes, which can be combined in any way? And an HTML document is supposed to be a complete set of nodes, so a combination of those won't be a single document anymore.

Nodes can be addressed individually, but a document is the unit of transmission, which also contains metadata. You can combine nodes as you like, but you can't really combine two already packed and annotated documents of nodes.

So I would say it is more due to semantic meaning. I think there was also the idea of requesting arbitrary sets of nodes, but that was never developed, and with the shift away from a semantic document, it didn't make sense anymore.

kyledrake 05/03/2025

At least some of the blame here is the bias towards HTML being something that is dynamically generated by code, as opposed to something that is statically handwritten by many people.

There are features that would be good for the latter that have been removed. For example, if you need to embed HTML code examples, you can use the <xmp> tag, which makes it so you don't need to encode escapes. Sadly, the HTML5 spec is trying to obsolete the <xmp> tag even though it's the only way to make this work. All browsers seem to be supporting it anyways, but once it is removed you will always have to encode the examples.

HTML spec developers should be more careful to consider people hand coding HTML when designing specifications, or at least decisions that will require JavaScript to accomplish something it probably shouldn't be needed for.

evrimoztamur 05/03/2025

If you want to include HTML sandboxes, we have iframes. If you want it served from the server, it's just text. Putting text A inside text B is a solved problem.

Evidlo 05/03/2025

You can get JS-free, client-side include functionality if you're willing to wrap your HTML in XML. Here is a demo:

https://github.com/Evidlo/xsl-website

_heimdall 05/03/2025

If I really need HTML includes for some reason, I'd reach for XSLT. I know it's old, and barely maintained at best, but that was the layer intentionally added to add programming language features to the markup language that is HTML.

bambax 05/03/2025

> We’ve got <iframe>, which technically is a pure HTML solution, but they are bad for overall performance, accessibility, and generally extremely awkward here

What does this mean? This is a pure HTML solution, not just "technically" but in reality. (And before iframe there were frames and frameset.) Just because the author doesn't like them doesn't make them non-existent.

mixmastamyk 05/03/2025

Lots of rationalization in here—it's always been needed. I complained about the lack of <include src="..."> when building my first site in '94/95, with simpletext and/or notepad!

It was not in the early spec, and it seems someone powerful wouldn't allow it in later. So everyone else made workarounds, in any way they could, which lessened the need quite a bit.

My current best workaround is the <object data=".."> tag, which has a few better defaults than iframe. If you put a link to the same stylesheet in the include file it will match pretty well. Size with width=100%, though with height you'll need to eyeball or use javascript.

Or, Javascript can also hoist the elements to the document level if you really need. Sample code at this site: https://www.filamentgroup.com/lab/html-includes/

somethingsome 05/03/2025

I'm not an expert on this but IMO, from a language point of view, HTML is a markup language, it 'must' have no logic or processing. It is there to structure the information not to dynamically change it. Nor even to display it nicely.

The logic is performed elsewhere. If you were to have includes directly in HTML, it means that browsers must implement logic for HTML. So it is not 'just' a parser anymore.

Imagine for example that I create an infinite loop of includes, who is responsible to limit me? How to ensure that all other browsers implement it in the same way?

What happens if I perform an injection from another website? Then we start to have CORS policy management to write. (iframes were bad for this.)

Now imagine using Javascript I inject an include somewhere, should the website reload in some way? So we have a dynamic DOM in HTML?

dheera 05/03/2025

Seems everyone forgot HTML-SSI which worked something like this. Many servers and hosting websites of the 90s supported it.

    <!--#include virtual="header.html" -->
    Some content here
    <!--#include virtual="footer.html" -->
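
For anyone who never used SSI: the server scans the page for these comment directives and splices in the referenced file before sending the response. Here is a rough sketch of that substitution in JavaScript. The `files` map and the `expandSsi` name are made up for illustration; real servers such as Apache's mod_include read from disk and support many more directives than this.

```javascript
// Hypothetical in-memory "document root"; a real server would read files from disk.
const files = {
  "header.html": "<header>Site header</header>",
  "footer.html": "<footer>Site footer</footer>",
};

// Replace each <!--#include virtual="..." --> directive with the named file's content.
// Unknown files become an HTML comment noting the failure (an assumption of this sketch).
function expandSsi(source, fileMap) {
  const directive = /<!--#include\s+virtual="([^"]+)"\s*-->/g;
  return source.replace(directive, (match, path) =>
    Object.prototype.hasOwnProperty.call(fileMap, path)
      ? fileMap[path]
      : `<!-- include failed: ${path} -->`
  );
}

const page = [
  '<!--#include virtual="header.html" -->',
  "Some content here",
  '<!--#include virtual="footer.html" -->',
].join("\n");

console.log(expandSsi(page, files));
```

The point of the sketch is that the browser only ever sees the finished page, which is why this approach needs a server (or a build step) in the first place.
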
mikewarot 05/04/2025

The reason is simple, HTML is not a hypertext markup language. Markup is the process of adding commentary and other information on top of an existing document, and HTML is ironically incapable of doing the one thing it most definitely should be able to do.

It's so bad that if you want to discuss marking up hypertext (i.e. putting notes on top of existing read-only text files, etc.) you'll have to Google the word "annotation" to even start to get close.

Along with C macros, Case Sensitivity, Null terminated strings, unauthenticated email, ambient authority operating systems, HTML is one of the major mistakes of computing.

We should have had the Memex at least a decade ago, and we've got this crap instead. 8(

TZubiri 05/03/2025

Because it's HyperText, the main idea is that you link to other content, so this is not a weird feature that is being asked for, it's just a different way of doing the whole raison d'etre of the tech. In fact the tag to link stuff is the <a> tag. It just so happens that it makes you load the other "page", instead of transcluding content, the idea is that you load it.

It wouldn't make sense to transclude the article about the United States in the article about Wyoming (and in fact modern Wikipedia shows a pop-up bubble doing a partial transclusion, but would benefit in no way from basic HTML transclusion).

It's a simple idea. But of course modern HTML is not at all what HTML was designed to be, but that's the canonical answer.

The elders of HTML would just tell you to make an <a> link to whatever you wanted to transclude instead. Be it a "footer/header/table of contents" or another encyclopedic article, or whatever. Because that's how HTML works, and not the way you suggest.

Think of what would happen if it were the case: you would transclude page A, which transcludes page B, and so on with page C, possibly recursively transcluding page B, and so on. You would transform the User Agent (browser) into a whole WWW crawler!

It's because HTML is pass by reference, not pass by copy.

SJC_Hacker 05/03/2025

Initially HTML was less about the presentation layer and more about the "document" concept. Documents should be self-contained, outside of references to other documents.

superkuh 05/04/2025

I still use server side includes. It is absolutely the best ratio of templating power to attack surface. SSI basically hasn't changed in the last 20 years and is solid in apache, nginx, etc. You can avoid all the static site generator stuff and just write pure .html files.

It should not have gone away. It never did for me.

Also, this is kind of what 'frames' were and how they were used. Everything old is new again.

jsdwarf 05/03/2025

I'd say in 80% of the cases a pure, static html include is not enough. In a menu include, you want to disable the link to the currently shown page or show a page specific breadcrumb. In a footer include, you may want a dynamic "last updated" timestamp or the current year in the copyright notice. As all these use cases required a server-side scripting language anyway, there was no push behind an html include.
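
To illustrate the menu case, here is a small sketch of what a "partial" usually looks like once it needs to know the current page. The `pages` list and the `renderNav` function are hypothetical names for this example only; any templating language would express the same idea.

```javascript
// Hypothetical page list; names and paths are illustrative only.
const pages = [
  { path: "/index.html", title: "Home" },
  { path: "/about.html", title: "About" },
];

// Render a shared menu, but emit the current page as plain text with
// aria-current instead of a link. A static include cannot do this because
// it has no notion of which page is including it.
function renderNav(pageList, currentPath) {
  const items = pageList.map((page) =>
    page.path === currentPath
      ? `<li><span aria-current="page">${page.title}</span></li>`
      : `<li><a href="${page.path}">${page.title}</a></li>`
  );
  return `<nav><ul>${items.join("")}</ul></nav>`;
}

console.log(renderNav(pages, "/about.html"));
```

The extra `currentPath` argument is exactly the piece of information a plain file include has no way to receive.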

djoldman 05/03/2025

> Our developer brains scream at us to ensure that we’re not copying the exact code three times, we’re creating the header once then “including” it on the three (or a thousand) other pages.

Interesting, my brain is not this way: I want to send a minimum number of files per link requested. I don't care if I include the same text because the web is generally slow and it's generally caused by a zillion files sent and a ton of JS.

esprehn 05/03/2025

We discussed this back when creating web components, but the focus quickly became about SPA applications instead of MPAs and the demand for features like this was low in that space.

I wish I would have advocated more for it though. I think it would be pretty easy to add using a new attribute on <script> since the parser already pauses there, so making something like <script transclude={url}> would likely not be too difficult.

ludwik 05/03/2025

We used to have this in the form of a pair of HTML tags: <frameset> and <frame> (not to be confused with the totally separate <iframe>!). <frameset> provided the scaffolding with slots for multiple frames, letting you easily create a page made up entirely of subpages. It was once popular and, in many ways, worked quite neatly. It let you define static elements once entirely client-side (and without JS!), and reload only the necessary parts of the page - long before AJAX was a thing. You could even update multiple frames at once when needed.

From what I remember, the main problem was that it broke URLs: you could only link to the initial state of the page, and navigating around the site wouldn't update the address bar - so deep linking wasn’t possible (early JavaScript SPA frameworks had the same issue, BTW). Another related problem was that each subframe had to be a full HTML document, so they did have their own individual URLs. These would get indexed by search engines, and users could end up on isolated subframe documents without the surrounding context the site creator intended - like just the footer, or the article content without any navigation.

aquova 05/04/2025

I 100% agree with the sentiment of this article. For my personal website, I write pretty much every page by hand, and I have a header and a footer on most of those pages. I certainly don't want to have to update every single page every time I want to add a new navigation button to the top of the page. For a while I used PHP, but I was running a PHP server literally for only this feature. I eventually switched to JavaScript, but likewise, on a majority of my pages, this was the only JavaScript I had, and I wanted to have a "pure" HTML page for a multitude of reasons.

In the end, I settled on using a Caddy directive to do it. It still feels like a tacked-on solution, but this is about as pure as I can get to just automatically "pasting" in the code, as described in the article.

panny 05/04/2025

SVG use element can do exactly what the OP desires. SVGs can be inlined in html and html can be inlined in SVG too. I never understand why web devs learn html and then stop there instead of also learning svg which looks just like html, but with a lot more power.

masswerk 05/04/2025

Fun fact: this does work with iframes:

  <ul>
    <li><a href="about.html" target="display">about</a></li>
    <li><a href="contact.html" target="display">contact</a></li>
  </ul>

  <iframe src="about.html" name="display"></iframe>
The important part is that the target iframe must have a `name` attribute (not identified by `id`.) I guess, this is a legacy of framesets & frames.

(Of course, this has all the issues of framesets, as in deep linking, accessibility, etc.)

jasoncartwright 05/03/2025

I made this to get around pages being cached at CDN level, but still needing to get live data...

https://github.com/jasoncartwright/clientsideinclude

DJHenk 05/03/2025

My guess: no-one needs it.

Originally, iframes were the solution, like the post mentions. By the time iframes became unfashionable, nobody was writing HTML with their bare hands anymore. Since then, people use a myriad of other tools and, as also mentioned, they all have a way to fix this.

So the only group who would benefit from a better iframe is the group of people who don't use any tools and write their HTML with their bare hands in 2025. That is an astonishingly small group. Even if you use a script to convert markdown files to blog posts, you already fall outside of it.

No-one needs it, so the iframe does not get reinvented.

hyperhello 05/03/2025

I think it’s because it would be so easy to make a recursive page that includes itself forever. So you have to have rules when it’s okay, and that’s more complex and opaque than just programming it yourself.
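
For what it's worth, the self-including page is detectable: a sketch of one possible rule is to track the chain of documents currently being expanded and refuse any include that points back into it. Everything below is an assumption for illustration (the `<include src>` syntax is the hypothetical tag from this thread, and `fragments` stands in for network fetches); it is not how any browser behaves.

```javascript
// Hypothetical fragment store keyed by URL; stands in for network fetches.
const fragments = {
  "page.html": 'Page <include src="nav.html"> end',
  "nav.html": "<nav>links</nav>",
  "loop-a.html": 'A <include src="loop-b.html">',
  "loop-b.html": 'B <include src="loop-a.html">',
};

// Expand <include src="..."> recursively. `ancestors` holds the chain of
// documents currently being expanded, so a fragment that includes one of
// its own ancestors is reported instead of recursing forever.
function expandIncludes(name, store, ancestors = []) {
  if (ancestors.includes(name)) {
    throw new Error(`include cycle: ${[...ancestors, name].join(" -> ")}`);
  }
  const source = store[name];
  if (source === undefined) {
    throw new Error(`missing fragment: ${name}`);
  }
  const chain = [...ancestors, name];
  return source.replace(/<include src="([^"]+)">/g, (match, child) =>
    expandIncludes(child, store, chain)
  );
}

console.log(expandIncludes("page.html", fragments));
try {
  expandIncludes("loop-a.html", fragments);
} catch (err) {
  console.log(err.message);
}
```

The rule itself is short; the harder part, as noted above, is getting every browser to agree on what happens when it triggers.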

kmoser 05/03/2025

SHTML used to be a thing back in the 1990s: https://en.wiktionary.org/wiki/SHTML

tanepiper 05/04/2025

My first ever website I wrote with mod_include and .shtml - updating a website was just adding a few tags.

Also I miss framesets - with that a proper sidebar navigation was easily possible.

Cort3z 05/04/2025

Kind of serious question. Do we have any alternatives to html? If not, why? It’s essentially all html. Yes, browser will render svg/pdf/md and so on, but as far as I can tell, it’s not what I consider "real web" (links to other documents, support for styling, shared resources, scripting, and so on ).

I would have loved for there to be a JSON-based format, or perhaps YAML, as an alternative to the XML-based stuff we have today.

prkl 05/03/2025

honestly, html can include css and javascript via link and script tags. there's no reason for it to not have an <include src="" /> tag, and let the browser parsing it fetch the content to replace it.

simultsop 05/03/2025

It's a pity: of all web resource advancements (JS, CSS, runtimes, web engines), HTML was the most stagnant, despite the "HTML5" effing hype. My guess is they did not want to empower HTML and threaten SSR or other solutions. I believe the biggest concern holding this back is the damned backward compatibility. Some just won't budge.

miragecraft 05/03/2025

I too lamented the loss of HTML imports and ended up coming up with my own JavaScript library for it.

https://miragecraft.com/blog/replacing-html-imports

At the end of the day it’s not something trivial to implement at the HTML spec/parser level.

For relative links, how should the page doing the import handle them?

Do nothing and let it break, convert to absolute links, or remap it as a new relative link?

Should the include be done synchronously or asynchronously?

The big benefit of traditional server side includes is that it's synchronous, thus simplifying logic for in-page JavaScript, but all browsers are trying to eliminate synchronous calls for speed, so it’s hard to see them agreeing to add a new synchronous bottleneck.

Should it be CORS restricted? If it is then it blocks offline use (file:// protocol) which really kills its utility.

There are a lot of hurdles to it and it’s hard to get people to agree on the exact implementation, it might be best to leave it to JavaScript libraries.
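
To make the relative-link question concrete, here is a minimal sketch of the "convert to absolute links" option using the standard `URL` constructor. The example.com URLs and the `resolveIncludedLink` helper are assumptions for illustration; they are not part of any spec or of the library linked above.

```javascript
// Resolve a link found inside an included fragment against the fragment's own
// URL rather than the including page's URL.
function resolveIncludedLink(href, fragmentUrl) {
  return new URL(href, fragmentUrl).href;
}

// Hypothetical site layout, for illustration only.
const pageUrl = "https://example.com/index.html";
const fragmentUrl = "https://example.com/partials/nav.html";

// Left alone, the link would resolve against the including page...
console.log(new URL("about.html", pageUrl).href);
// ...whereas converting to an absolute link keeps the fragment author's intent.
console.log(resolveIncludedLink("about.html", fragmentUrl));
```

The two results differ (`/about.html` versus `/partials/about.html`), which is the ambiguity a spec would have to settle one way or the other.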

neuroelectron 05/03/2025

So glad I decided early in my career to not do webpages. Look how much discussion this minor feature has generated. I did make infra tools that outputted basic html, get post cgi type of stuff. What's funny is this stuff was deployed right before AWS was launched and a year later the on prem infra was sold and the warehouse services were moved to the cloud.

psychoslave 05/04/2025

We have the object tag, don't we? Is there anything wrong with it?

https://www.w3.org/TR/WD-html40-970708/struct/includes.html#...

JanSchu 05/05/2025

Why can’t I just write <include src="header.html"> and be done with it?

We almost could. Chrome shipped a draft of HTML Imports back in 2014. You’d write <link rel="import" href="header.html">, and the browser would fetch the document, parse it, and make it available to script for insertion. The idea died for three reasons that still apply today:

Execution-order and performance hazards. Images, scripts, and styles are fire-and-forget: the preload scanner sees a URL, starts the fetch, and the parser keeps streaming. With HTML fragments you need the full subtree before you can finish parsing the parent document (otherwise IDs, custom-element upgrades, <script defer>, etc. fire in the wrong order). That either stalls the parser, which is horrible for first paint, or forces async insertion, which produces layout shifts. Everyone hated both outcomes.

Security and isolation. If an imported fragment can run scripts it becomes an XSS foot‑gun; if it can’t run scripts it breaks a surprising amount of markup (think onerror, custom elements with module scripts, CSP inheritance, etc.). The platform already has an “HTML that can’t run scripts” container: it’s called an iframe. Anything more permissive lands in a swamp of half‑trusted execution.

The “circular dependency” tar‑pit. Templates inherit CSS scopes, custom element registries, and base URLs from the document that instantiates them. Once you let HTML pull in more HTML, those scopes can nest arbitrarily—and can link back to parents. The HTML spec team tried to spec out the edge‑cases and basically threw up their hands. (There’s a famous TAG thread titled “HTML Imports considered harmful” that reads like war diaries.)

Meanwhile developers solved the “shared header” problem higher up the stack—SSI, PHP include, SSG partials, React components, you name it—so browser vendors didn’t see a payoff big enough to justify the complexity. The attitude became: “composition is a build‑time concern, not a runtime primitive.”

Could it ever come back? Maybe, but the bar is higher now that everyone has a build step. A proposal would need to:

Stream (no parser‑blocking)

Sandbox (no ambient script execution)

Deduplicate (avoid circular fetch hell)

Play nicely with CSP, SRI, origin isolation, and the module graph

That starts to look a lot like… <iframe src="header.html" loading="eager">, which we already have—just not the ergonomic sugar we wish for.

So the short answer is: HTML includes are easy in user‑land but devilishly hard to make safe, fast, and spec‑compliant in the browser itself.

jiffygist 05/04/2025

On topic: what's the absolute minimal static site generator that can achieve this feature? I know things like Pelican can do it but it's pretty heavy. C preprocessor probably can be used for this...

wodenokoto 05/04/2025

In the nineties we fixed it with frames or CGI. I still think of it as one of those “if it was fiction it would be unrealistic” things (although, who writes fictional markup standards?)

WhyNotHugo 05/03/2025

HTML frames solved this problem just fine, but they were deprecated in favour of using AJAX to replace portions of the body as you navigate (e.g. SPAs).

I still feel like frames were great for their use case.
