As the author identifies, the idioms come from the use of system frameworks that steer you towards idiomatic implementations.
The system UI frameworks are tremendously detailed and handle so many corner cases you'd never think of. They allow you to graduate into being a power user over time.
Windows has Win32, and it was easier to use its controls than to roll your own custom ones. (Shame they left the UI side of Win32 to rot.)
macOS has AppKit, which enforces a ton. You can't change the height of a native button, for example.
iOS has UIKit, similar deal.
The web has nothing. You gotta roll your own, and it'll be half-baked at best. And since building for modern desktop platforms is horrible, the framework-less web is being used there too.
The web was designed for interactive documents, not desktop applications. The layout engine was inspired by typesetting (floats, blocks) and a lot of elements only make sense for text (<i>, <span>, <strong>, ...). There's also no allowance for dynamic data (virtualization of lists) or custom components (canvas and SVG are not great in that regard).
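To make the list virtualization point concrete: with no built-in virtualized list, web apps end up hand-rolling windowing math along these lines. This is only a minimal sketch that assumes fixed-height rows; `visibleRange`, `VisibleRange`, and the `overscan` parameter are hypothetical names for illustration, not part of any browser API or library.

```typescript
// Hypothetical helper: which rows of a long list need real DOM nodes right now?
// Assumption: every row has the same fixed height in pixels.
interface VisibleRange {
  start: number; // index of the first row to render (inclusive)
  end: number;   // index one past the last row to render (exclusive)
}

function visibleRange(
  scrollTop: number,      // current scroll offset of the list container, in px
  viewportHeight: number, // visible height of the list container, in px
  rowHeight: number,      // fixed height of one row, in px
  rowCount: number,       // total number of rows in the data set
  overscan = 2            // extra rows rendered above and below the viewport
): VisibleRange {
  if (rowHeight <= 0 || rowCount <= 0) {
    return { start: 0, end: 0 };
  }
  // Rows that intersect the viewport, before clamping to the data set.
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  // Clamp so overscroll or stale scroll offsets never yield invalid indices.
  const start = Math.min(rowCount, Math.max(0, first - overscan));
  const end = Math.max(start, Math.min(rowCount, last + overscan));
  return { start, end };
}
```

Even this is the easy part. Spacer elements, variable row heights, keyboard focus, and accessibility are left to each app, which is roughly what a system framework's list control would otherwise handle for you.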
> building for modern desktop platforms is horrible, the framework-less web is being used there too.
I think it's more related to PMs wanting to "brand" their product and developers optimizing things for themselves (in the short term), not for their users.
> The web has nothing. You gotta roll your own, and it'll be half-baked at best. And since building for modern desktop platforms is horrible, the framework-less web is being used there too.
This feels like the root cause to me as well. Or more specifically, the web does have idioms; the problem is that those idioms are still stuck in the early 1990s and assume the web is a collection of science papers with hyperlinks and the occasional image, data table and submittable form.
This is where the "favourites" list and the ability to select any text on a web page came from.
Web apps not only have to build an application UI completely from scratch, they also have to do it on top of a document UI that "wants" to do something completely different.
Modern browsers have toned down those idioms and essentially made it "easier to fight them", but didn't remove or improve them.
That’s not the only reason. When you are used to how your operating system does things consistently, as a developer you naturally want your application to also behave like you’re used to in that environment.
This eroded on the web, because a web page was a bit of a different “boxed” environment, and completely broke down with the rise of mobile, because the desktop conventions didn’t directly translate to touch and small screens, and (this goes back to your point) the developers of mobile OSs introduced equivalent conventions only half-heartedly.
For example, long-press could have been a consistent idiom for what right-click used to be on desktop, but that wasn’t done initially and later was never consistently promoted, competing with Share menus, ellipsis menus and whatnot.
The web did have HTML and CSS, but as the author notes, those have been bypassed for WebAssembly and other technologies.
Date pickers and credit card entry should always always always use the default HTML controls, and the browser and OS should provide the appropriate widget for every single web page. For credit cards especially, the Safari implementation could tie into the iOS Apple Wallet or Apple Pay, and Android could provide the Google equivalent. This allows the platform to enforce both security policy and convenience without every developer in the world trying to get those exactly right in a non-standard way.
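For what it's worth, a rough sketch of what "use the default controls" can look like today: a native `type="date"` input plus the standard `autocomplete` tokens for card fields (`cc-name`, `cc-number`, `cc-exp`, `cc-csc`). The helper names `field` and `nativePaymentForm`, the labels, and the `name` values are made up for illustration. Whether the browser then offers saved cards or a wallet is up to the platform, not something this markup can guarantee.

```typescript
// Sketch only: builds form markup that leans on native controls and autofill
// hints instead of custom widgets. `field` and `nativePaymentForm` are
// hypothetical helpers; the attribute values below are hard-coded constants,
// so no HTML escaping is done here.
function field(label: string, attrs: Record<string, string>): string {
  const attrText = Object.keys(attrs)
    .map((key) => key + '="' + attrs[key] + '"')
    .join(" ");
  return "<label>" + label + " <input " + attrText + "></label>";
}

function nativePaymentForm(): string {
  return [
    "<form>",
    // Native date picker: the browser/OS decides what the widget looks like.
    field("Delivery date", { type: "date", name: "delivery" }),
    // Standard autofill tokens tell the browser these are payment card fields.
    field("Name on card", { name: "ccname", autocomplete: "cc-name" }),
    field("Card number", {
      name: "ccnumber",
      inputmode: "numeric",
      autocomplete: "cc-number",
    }),
    field("Expiry", { name: "ccexp", autocomplete: "cc-exp" }),
    field("Security code", {
      name: "cccsc",
      inputmode: "numeric",
      autocomplete: "cc-csc",
    }),
    "<button>Pay</button>",
    "</form>",
  ].join("\n");
}
```

Assigning the result of `nativePaymentForm()` to a container's `innerHTML` would render the form. The point is that nothing in it tries to reimplement a date picker or card-number widget.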
<button>Click me</button>
Is how you do it on the web. The problem is that it means your app will not look as good as others and that it will look different on different platforms.
> You can't change the height of a native button, for example.
You can definitely do so, it's just not obvious or straightforward in many contexts.
Bootstrap was nice.
The author may have identified that "the idioms come from the use of system frameworks", but they got just about everything wrong about why apps are not consistent on the web (e.g. I was baffled by the reasons listed under the "this lack of homogeneity is for two reasons" section).
First, what they call "the desktop era" wasn't so much a desktop era as a Windows era: Windows ran the vast majority of desktops (and furthermore, there were plenty of inconsistencies between Windows and Mac). So, as you point out regarding the Win32 API, developers had essentially one way to do things, or at least one way that was by far the easiest. Developers weren't so much "following design idioms" as "doing what is easy to do on Windows".
The web started out as a document sharing system, and it only gradually and organically turned over to an app system. There was simply no single default, "easiest" way to do things (and despite that, I remember when it seemed like the web converged all at once onto Bootstrap, because it became the easiest and most "standard" way to do things).
In other words, I totally agree with you. You can have all the "standard idioms" that you want, but unless you have a single company providing and writing easy-to-use, default frameworks, you'll always have lots of different ways of doing things.