> 422 network requests and 49 megabytes of data
Just FYI how this generally works: it's not developers who add it, but non-technical people.
Developers only add a single `<script>` in the page, which loads Google Tag Manager, or similar monstrosity, at the request of someone high up in the company. Initially it loads ~nothing, so it's fine.
Over time, non-technical people slap as many advertising "partner" scripts as they can into the GTM config, straight to prod, without telling developers and without thinking twice about the impact on loading times etc. All they track is $ earned on ads.
(It's sneaky because those scripts load async in the background, so the website doesn't immediately feel slower or more bloated. And of course, on a high-end laptop the website feels "fine" compared to a cheap Android. Also, there's nothing developers can do about those requests; they're fully under the control of all those 3rd parties.)
Fun fact: "performance" in the parlance of adtech people means "ad campaign performance", not "website loading speed". ("What do you mean, performance decreased when we added more tracking?")
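For reference, the entire developer-visible footprint is usually just the standard GTM loader, lightly adapted here into a named function so it can run outside a browser (the container ID is a placeholder, and the fake DOM below is purely illustrative):

```javascript
// Essentially the standard Google Tag Manager loader snippet.
// This is all the developer ships; every "partner" tag is configured
// later in the GTM web UI and injected at runtime by gtm.js.
function gtmLoader(w, d, s, l, i) {
  w[l] = w[l] || [];
  w[l].push({ 'gtm.start': new Date().getTime(), event: 'gtm.js' });
  var f = d.getElementsByTagName(s)[0],
      j = d.createElement(s);
  j.async = true; // async: the page never *feels* slower, even as tags pile up
  j.src = 'https://www.googletagmanager.com/gtm.js?id=' + i;
  f.parentNode.insertBefore(j, f);
}

// Minimal fake DOM, just to show what a single call sets in motion:
const injected = [];
const fakeDoc = {
  getElementsByTagName: () => [{ parentNode: { insertBefore: (el) => injected.push(el) } }],
  createElement: () => ({}),
};
const fakeWin = {};
gtmLoader(fakeWin, fakeDoc, 'script', 'dataLayer', 'GTM-XXXXXXX');
console.log(injected[0].src); // the one request the developer is aware of
```

Everything after that one request is decided in a web UI the developers never see.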
Author here. Woke up today to see this on the front page; thank you to the person who submitted it! Initially, my biggest fear was the HN "Hug of Death" taking it down. Happily, Cloudflare's edge caching absorbed 19.24 GB of bandwidth in a few hours with a 98.5% cache hit ratio, so the origin server barely noticed.
The discussions here about DNS-level blocking and Pi-hole are spot on. It's interesting that the burden of a clean reading experience is slowly being offloaded to the user's network stack.
These days the NYT is in a race to the bottom. I no longer even bother to bypass ads let alone read the news stories because of its page bloat and other annoyances. It's just not worth the effort.
Surely news outlets like the NYT must realize that savvy web surfers like yours truly, when encountering "difficult" news sites—those behind paywalls and/or bloated with megabytes of JavaScript—will just go elsewhere or load pages without JavaScript.
We'll simply cut the headline from the offending website, paste it into a search engine, and find another site with the same or similar info but easier access.
I no longer even think about it; by now my actions are automatic. Rarely do I find an important story limited to only one website. Generally dozens carry the story, and because of syndication the alternative site one selects often has identical text and images.
My default browsing is with JavaScript defaulted to "off" and it's rare that I have to enable it (which I can do with just one click).
I never see ads on my Android phone or PC, and that includes YouTube. Disabling JavaScript on webpages nukes just about all ads; they just vanish, and any that slip through are trapped by other means. In short, ads are optional. (YouTube doesn't work sans JS, so just use NewPipe or PipePipe to bypass ads.)
Disabling JavaScript also makes pages blindingly fast as all that unnecessary crap isn't loaded. Also, sans JS it's much harder for websites to violate one's privacy and sell one's data.
Do I feel guilty about skimming off info in this manner? No, not the slightest bit. If these sites played fair then it'd be a different matter but they don't. As they act like sleazebags they deserve to be treated as such.
My family's first broadband internet connection, circa 2005, came with a monthly data quota of 400 MB.
The fundamental problem of journalism is that the economics no longer works out. Historically, the price of a copy of a newspaper barely covered the cost of printing; the rest of the cost was covered by advertising. And there was an awful lot of advertising: everything was advertised in newspapers. Facebook Marketplace and Craigslist were a section of the newspaper, as was whichever website you check for used cars or real estate listings. Journalism had to be subsidised by advertising, because most people aren't actually interested enough in the news to pay the full cost of quality reporting; nowadays, the only newspapers that are thriving are those that aggressively target those who have an immediate financial interest in knowing what's going on: the Financial Times, Bloomberg, and so on.
The fact is that for most people, the news was interesting because it was new every day. Now that there is a more compelling flood of entertainment in television and the internet, news reporting is becoming a niche product.
The lengths that news websites are going to to extract data from their readers to sell to data brokers is just a last-ditch attempt to remain profitable.
I just loaded the nytimes.com page as an experiment. The volume of tracking pixels and other ad nonsense is truly horrifying.
But at least in terms of the headline metric of bandwidth, it's somewhat less horrifying. With my ad blocker off, Firefox showed 44.47 MB transferred. Of that, 36.30 MB was MP4 video. These videos were journalistic in nature (they were not ads).
So, yes in general, this is like the Hindenburg of web pages. But I still think it's worth noting that 80% of that headline bandwidth is videos, which is just part of the site's content. One could argue that it is too video heavy, but that's an editorial issue, not an engineering issue.
I also like the comparison in units of Windows 95 installs (~40 MB), which is rather ironic in that Win95 was widely considered bloated when it was released.
While this article focuses on ads, it's worth noting that sites have had ads for a long time, but it's their obnoxiousness and resource usage that's increased wildly over time. I wouldn't mind small sponsored links and (non-animated!) banners, but the moment I enable JS to read an article and it results in a flurry of shit flying all over the page and trying to get my attention, I leave promptly.
Not only are loading times and total network usage ridiculous, sites will continue to violate your privacy via trackers and waste your CPU even when background idling. I've written about these issues a few times in the last few years, so just sharing for those interested:
A comparison of CPU usage for idling popular webpages: https://ericra.com/writing/site_cpu.html
Regarding tracker domains on the New Yorker site: https://ericra.com/writing/tracker_new_yorker.html
Allowing scripting on websites (in the mid-90s) was a completely wrong decision. And an outrage. Programs are downloaded to my computer and executed without me being able to review them first—or rely on audits by people I trust. That’s completely unacceptable; it’s fundamentally flawed. Of course, you disable scripts on websites. But there are sites that are so broken that they no longer work properly, since the developers are apparently so confused that they assume people only view their pages with JavaScript enabled.
It would have been so much better if we had simply decided back in the ’90s that executable programs and HTML don’t belong together. The world would be so much better today.
This is just the tip of the iceberg. Don't get me started on airline websites (looking at you, Air Canada), where the product owner, designers, and developers can't get a simple workflow straight without loading megabytes of useless JavaScript and interrupting the user journey multiple times. Give me back a command-line terminal like Amadeus; that would be perfect.
How can we go back to a Web where websites are designed to be used by the user and not for the shareholders?
I started writing apps in a Dioxus (Rust framework) style: at most 1 KB of JS code, with the diff sent over a WebSocket from the Rust server. What's more important, all the code now lives on the server, and because of the WebSocket and Rust it executes at almost the same speed as it would on the client. Back to normal page sizes. And, of course, virtual scrolling everywhere.
Modern web dev is ridiculous. Most websites are an ad-ridden tracking hellscape. Seeing sites like HN, where individual lines of JS are taken seriously, is a godsend. Make the web less bloated.
This is why people continue to lament Google Reader (and RSS in general): it was a way to read content on your own terms, without getting hijacked by ads.
Author forgot to mention scroll hijacking on their list. This is one of the worst offenses.
I remember in 2008, when Wizards of the Coast re-launched the official Dungeons & Dragons website to coincide with the announcement of the fourth edition rules. The site was something in the region of 4 MB, plus a 20 MB embedded video file. A huge number of people were refreshing the site to see what the announcement was, and it was completely slammed. Nobody could watch the trailer until they uploaded it to YouTube later.
4 MB was an absurd size for a website in 2008. It's still an absurd size for a website.
> users are greeted by what I call Z-Index Warfare
Nice term!
> Or better yet, inject the newsletter signup as a styled, non-intrusive div between paragraphs 4 and 5. If the user has scrolled that far, they are engaged.
They're engaged with the content! There is no way to make some irrelevant signup "non-intrusive". It's similar to links to unrelated articles - do you want users to actually read the article or jump around reading headlines?
This rubbish is disproportionately common on recipe pages and cooking websites as well.
You have 20 ads scattered around, an autoplaying video of some random recipe or ad, two or three popups to subscribe or buy some affiliated product, then the author's life story, and then a story ABOUT the recipe, before I am able to see the detailed recipe in the proper format.
At this point it's second nature for me to open all these websites in reader mode.
You want to know why so many people either jump straight to comments or use alternate sources (archive, llms)? Because if you load the actual site, it freaking blows to use the damn thing.
So much hostile user design.
Edit: NPR gets a little shout out for being able to close their annoying pop-ups by clicking anywhere that's not the notification. So it's still crappy that it hijacks the screen, but not awful I guess?
It's really hard to consider any kind of web dev as "engineering." Outcomes like this show that they don't have any particular care for constraints. It's throw-spaghetti-at-the-wall YOLO programming.
The layout of news sites peaked with cnn.com back in the 1998 to 2002 timeframe. It's been downhill ever since.
It's almost criminal that the article does not mention network-wide DNS blocklists as an obvious solution to this problem. I stop nearly 100% of ads in their tracks using the Hagezi ultimate list, and run uBlock on desktop for cosmetic filtering and YouTube.
I should really run some tests to figure out how much lighter the load on my link is thanks to the filter.
I also manually added some additional domains (mostly fonts by Google and Adobe) to further reduce load and improve privacy.
> I don't know where this fascination with getting everyone to download your app comes from.
Apps don't have adblockers.
One of the things I don't get is the economics of these trackers.
Someone is serving this amount of data to every visitor. Even if you want to track the user as much as possible, wouldn't it make sense to figure out how to do that with the least amount of data transfer possible as that would dramatically reduce your operating cost?
Perhaps size optimization is the next frontier for these trackers.
It would be less hypocritical if this critique of the situation weren't posted on a website that itself loads unnecessary 3rd-party resources (e.g. Cloudflare Insights).
Luckily I use a proper content blocker (uBlock Origin in hard mode).
When working at the BBC in the late 90s, the ops team would start growling at you if a site's home page was over 70 KB...
The newest thing is “please wait a minute while Cloudflare decides you’re not a bot.” So you sit through that, then deal with the GDPR banner, then you get to watch an ad.
Meanwhile, everyone tells me I have to shave every KB off my web app.
I was thinking about creating charts of shame for this across some sites. Is there a browser extension that categorizes the data sources and requests, like in a pie chart or table? Tracking, ad media, first-party site content...? A stacked bar chart with segments colorized by data category would be nice.
Maybe you'd need one chart for request counts (to make tracking stand out more) and another for the amount of transferred data.
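The core of such an extension is a few lines. A minimal sketch, with made-up domain patterns (real blocklists like EasyList are far more thorough); in a browser you would feed it `performance.getEntriesByType('resource')`, whose entries expose `.name` and `.transferSize`:

```javascript
// Bucket resource requests by category using domain patterns,
// tallying both request counts and transferred bytes.
// The patterns here are illustrative, not exhaustive.
const BUCKETS = [
  { name: 'tracking', pattern: /doubleclick|google-analytics|googletagmanager|facebook|scorecardresearch/ },
  { name: 'ad media', pattern: /adsystem|adservice|adnxs/ },
];

function summarize(entries) {
  const totals = {};
  for (const e of entries) {
    const bucket = (BUCKETS.find(b => b.pattern.test(e.name)) || { name: 'first party / other' }).name;
    totals[bucket] = totals[bucket] || { requests: 0, bytes: 0 };
    totals[bucket].requests += 1;
    totals[bucket].bytes += e.transferSize || 0;
  }
  return totals;
}

// Fake entries standing in for performance.getEntriesByType('resource'):
const report = summarize([
  { name: 'https://example-news.com/article.html', transferSize: 80_000 },
  { name: 'https://www.googletagmanager.com/gtm.js?id=GTM-XXXXXXX', transferSize: 120_000 },
  { name: 'https://securepubads.g.doubleclick.net/tag/js/gpt.js', transferSize: 90_000 },
]);
console.log(report);
```

From there it's just a matter of rendering `report` as the two charts: one keyed on `requests`, one on `bytes`.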
Recently I read a survey claiming that one of the websites they reviewed shipped 50 MB of CSS.
I started on this project when I was at The New Yorker. I had just managed to convince people to give us space to do web performance optimization, and then we had to drop it quickly to work on AMP. Very frustrating.
This site was created to give developers and PMs some ammunition to work on improving load speed.
Here's something I wrote in 2021:
> Today, there's ~30 times more js than html on homepages of websites (from a list of websites from 5 years ago).
It seems that this number has only gone up.
I think it's a GOOD thing, actually. Because all these publications are dying anyway. And even if you filter out all the ad and surveillance trash, you're left with trash propaganda and brain-rot content. Why even make the effort of extracting the actual text from some "journalist" at these propaganda outlets? It's not even worth it.
If people tune out only because how horrible the sites are, good.
rule #1 is to always give your JS devs only Core 2 Quad CPUs + 16 GB of RAM
they won't be able to complain about low memory, but their experience will be terrible every time they try to shove something horrible into the codebase
This is about to get substantially worse as companies introduce more AI into their workflows.
I'm thinking I'm gonna start making all my webpages <1 MB in size, and compensating by adding a Windows 95 install to each page load.
Even enterprise COTS products can have some of these issues. We have an on-premise Atlassian suite, and Jira pages sometimes have upwards of 30MB total payloads for loading a simple user story page — and keep in mind there is no ad-tech or other nonsense going on here, it’s just pure page content.
In the same vein, https://512kb.club/ is a user-submitted directory featuring websites under 512 KB in size (blogs, news, etc.)!
>I don't know where this fascination with getting everyone to download your app comes from.
So they can do exactly what they're doing on the web, and maybe even more, but with native code so it feels much faster.
I've gotten to the point where I wonder why all the tracking companies and ad networks can't just share and use the same library.
But on web page bloat: let's not forget apps are insanely large as well. 300-700 MB for a banking, travel, or other shopping app. Even if you cut 100 MB of L10n, they're still large, again because of tracking and other things.
The article says "I don't know where this fascination with getting everyone to download your app comes from."
The answer is really simple and follows on from this article; the purpose of the app is even more privacy violation and tracking.
Yes, it's 100% horrible. For me the solution is simple: if I click a link and the page is covered in ads and popup videos, I CLOSE THE PAGE!!!!
Vote with your behavior. Stop going to these sites!
I was really surprised when I went to book a flight on Frontier (don't judge me!) and a request from analytics.tiktok.com loaded. I have a lot of discomfort about that. Bloat and surveillance go hand in hand.
Oh yeah, that old topic. We’ve already discussed this back when text-heavy websites started reaching megabyte sizes. So I’m going to go look for the posts in this thread that try to explain and defend that. I’m especially looking forward to the discussions about whether ad blocking is theft or morally reprehensible. If those are still around.
This site more or less practices what it preaches. `newsbanner.webp` is 87.1 KB (downloaded and saved; the Network tab in Firefox may report a few times that, and I don't know why); the total image size is less than a meg, and then there's just 65.6 KB of HTML and 15.5 KB of CSS.
And it works without JavaScript... but there does appear to be some tracking stuff. A deferred call out to Cloudflare, a hit counter I think? and some inline stuff at the bottom that defers some local CDN thing the old-fashioned way. Noscript catches all of this and I didn't feel like allowing it in order to weigh it.
This is why I have a pi-hole and a selection of addons for my browser.
I am considering moving to Technitium, though; it seems better featured.
Our developers once managed to hit around 750 MB per open website.
They put in a ticket with ops that the server was slow and could we look at it. So we looked. Every single video on a page with a long video list preloaded a part of itself. The only reason the site didn't run like shit for them is that the office had direct fiber to our datacenter a few blocks away.
We really shouldn't allow web developers more than 128 kbit/s of connection speed; anything more and they just make nonsense out of it.
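The preloading story above has a one-attribute fix: `preload="none"` on each `<video>` tag tells the browser to fetch no media bytes until the user actually hits play. A minimal sketch applying it from script (the plain-object stand-ins just let the sketch run anywhere):

```javascript
// Mitigation for the video-preloading problem: mark every video as
// preload="none" so nothing is fetched until the user presses play.
function stopPreloading(videos) {
  for (const v of videos) {
    v.preload = 'none';   // don't fetch anything up front
    v.autoplay = false;   // and definitely don't start playing
  }
  return videos;
}

// In a browser you'd call:
//   stopPreloading([...document.querySelectorAll('video')]);
// Plain-object stand-ins for <video> elements:
const fixed = stopPreloading([{ preload: 'auto', autoplay: true }]);
console.log(fixed[0].preload); // → none
```

Of course, the real fix belongs in the markup, not a script patched on afterwards.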