Hacker News

mariopt | yesterday at 3:45 PM | 25 replies

Every time I see this kind of article, no one really bothers about db/server redundancy, load balancers, etc. Are we OK with just 1 big server that may fail and bring several services down?

You saved a lot of money, but you'll spend a lot of time on maintenance and future headaches.


Replies

grey-area | yesterday at 3:49 PM

It depends on the service and how critical that website is.

Sometimes it's completely acceptable that a server will run for 10 years with, say, 1 week or 1 month of downtime spread over those 10 years, yes. That's the sort of uptime you can see with single servers that are rarely changed and over-provisioned, as many on Hetzner are. Some examples:

Small businesses where the website is not core to operations and is more of a shop-front or brochure for their business.

Hobby websites, too, don't really suffer if they go down for short periods occasionally.

Many forums and blogs just aren't very important either, and downtime is no big deal.

There are a lot of these websites. They sit at the lower end of the market for obvious reasons, but they are probably the majority of websites: the long tail of low-traffic sites.
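To put numbers on the figures above, a quick back-of-the-envelope calculation (nothing here is from the article, just arithmetic):

```python
# Back-of-the-envelope: what "1 week or 1 month of downtime
# spread over 10 years" means as an availability percentage.
HOURS_PER_YEAR = 365.25 * 24

def availability(downtime_hours: float, years: float) -> float:
    """Return uptime as a percentage over the given period."""
    total = HOURS_PER_YEAR * years
    return 100.0 * (total - downtime_hours) / total

week = availability(7 * 24, 10)    # ~99.81%, roughly "three nines"
month = availability(30 * 24, 10)  # ~99.18%, roughly "two nines"
print(f"1 week / 10 years:  {week:.2f}% uptime")
print(f"1 month / 10 years: {month:.2f}% uptime")
```

So even the pessimistic "1 month over 10 years" case is two-nines territory, which is plenty for a brochure site.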

Not everything has to be highly available, and if you do want that, these providers usually offer load balancers etc. too. I think people here sometimes forget that there is a huge range in hosting, from Squarespace to cheap shared hosting to more expensive self-hosted and provisioned clouds like AWS.

Aurornis | yesterday at 4:01 PM

These articles are popular where there's a mismatch between the application's requirements and the solution chosen. When someone over-engineers their architecture to be enterprise-grade (substitute your own definition of enterprise-grade) but is really running a hobby project or a small business, where a day of downtime every once in a while just means customers come back the next day, going all-out on cloud architecture is maybe not necessary. That's why you see so many comments from people arguing that downtime isn't always a big deal or that risking an outage is fine: for a lot of applications, that's more or less true.

The confusing part about this article is the emphasis on a zero-downtime migration toward a service that isn't really ideal for uptime. It wouldn't be that expensive to add a little bit of architecture on the Hetzner side to help with this. I guess if you're doing a migration and you're paid a salary, or your time is free-ish, doing the migration in a zero-downtime way is smart. It's a little funny to see the emphasis on zero downtime juxtaposed with the architecture they chose, where uptime depends on nothing ever failing.

chillfox | yesterday at 4:08 PM

A lot of things don't need that.

Also, don't underestimate the reliability of simplicity.

I was a Linux sysadmin for many years, and I have never seen as much downtime from simpler systems as I routinely see from more complicated setups. Somewhere between theory and reality, simpler systems just come out ahead most of the time.

wiether | yesterday at 4:04 PM

To be fair, they were using a single VM on DigitalOcean, so they didn't have the perks of a cloud provider, except maybe that a VM is probably more fault-tolerant than a bare-metal server.

Usually those articles describe two situations:

  - they were "on the cloud" for the wrong reasons and migrating to something more physical is the right approach
  - they were "on the cloud" for the right reasons and migrating to something more physical is going to be a disaster

Here they appear to be in the first situation. If their setup was running fine on DO and they put the right DR policies in place at Hetzner, they should be fine.
daneel_w | yesterday at 3:52 PM

They may be making this decision based on a long history of, in fact, never really having run into "a lot of time in maintenance and future headaches".

ahofmann | yesterday at 4:56 PM

In 20 years of hosting all kinds of web services, some of them serving over 200M requests per month, a single server crashing was a problem exactly twice.

Dealing with over-engineered bullshit that behaved in strange, service-disrupting ways was a problem far more often.

So, yes, redundancy is something that can be left out, if you're comfortable being responsible for fixing things on a Saturday morning.

supermatt | yesterday at 4:53 PM

They already were on "1 big server" (a single VPS at DigitalOcean) and moved to another "1 big server" (a managed server at Hetzner).

They saved money and lost nothing.

Now, if they so wish, they could use a portion of those savings to increase redundancy - but that wasn't the point of the article.

pier25 | yesterday at 7:08 PM

I don't know about Hetzner, but with UpCloud and Vultr my single-VPS setups have been more reliable than multi-region redundant setups with other providers like Fly.

neya | yesterday at 4:31 PM

I was thinking the same. A managed database is pretty much set-and-forget. I do NOT miss the old days of monitoring my email for routine security-checkup alerts, hoping my database hadn't been hacked by some script kiddie, with blackmail over email to follow.

chalmovsky | yesterday at 4:03 PM

What you are running on it is the only question that matters. Obviously you don't want air traffic control to go down, but some app? So what if it goes down? A backup is somewhere else, if you even need it. GitHub has uptime of less than 90% according to this: https://mrshu.github.io/github-statuses/ . And the world keeps turning. Obviously we should strive for better, but let's please not keep making an uptime fetish out of it; for the vast majority of apps it absolutely doesn't fucking matter.

jdboyd | yesterday at 4:37 PM

DO doesn't do high-availability droplets, and their migration policy amounts to "we'll try, if we detect poor health on the server before it fails."

If someone starts thinking about redundancy and load balancers, then DO's solution is to rent a second similar-sized droplet and add their load-balancing service. If you did those things with Hetzner instead, you would still be spending less than you do with DigitalOcean.
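The two-droplets-plus-load-balancer pattern can also be self-hosted with something as small as an nginx config. A sketch (the upstream addresses and ports here are hypothetical, not from the thread):

```nginx
# /etc/nginx/conf.d/app.conf -- round-robin across two app droplets
upstream app_backend {
    server 10.0.0.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:8080 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_backend;
        # Retry on the other backend if one droplet is down
        proxy_next_upstream error timeout http_502 http_503;
    }
}
```

A tiny VM running this in front of two app servers gets you failover without any managed load-balancer product.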

Personally, what is keeping me on DO is that no single droplet I have is large enough to justify moving on its own, and I'm not prepared to deal with moving everything.

grebc | yesterday at 9:15 PM

Downtime happens in so many contexts of life that a website/service being knocked offline is far down the priority list for most people.

It’s amusing that the US government can shut down for days/weeks/months over budget reasons and no adult discussions take place about fixing the cause. Yet the latest HN demo that 100 people will use needs all-nines reliability and gets hundreds of responses.

littlecranky67 | yesterday at 4:53 PM

To be fair, modern dedicated servers at Hetzner have two power supplies and come with a redundant SSD/HDD RAID-1 config. AFAIK both the SSDs and the power supplies are hot-pluggable, so if either fails it can be replaced with zero downtime.

Given the downtimes we saw in the past year(s) (AWS, Cloudflare, Azure - the latter down several times), I would argue that moving to any of the big cloud providers doesn't give you much better a guarantee.

I myself am a Hetzner customer with a dedicated vServer, meaning it is a shared virtual server but with dedicated CPUs (read: still oversubscribed, but with some performance guarantee), and I've had zero hardware-based downtime for years [0]. I would guess their vServers are on similarly redundant hardware where failing components can be hot-swapped.

[0] = Once in the last 3 years, they sent me an email saying they had to update a router, which would affect network connectivity for the vServer; the notification came weeks in advance and the work lasted about 15 minutes. No reboot or hardware failure on my vServer, though.

wg0 | yesterday at 4:32 PM

If you have the in-server setup fully scripted and automated (bash, pyinfra, Ansible, etc.) and backups are in place, then recovery isn't that hard. Downtime is maybe a couple of hours at most, during which you can point your DNS entries at a static page while you restore everything.
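A rebuild along those lines can be sketched as a small Ansible playbook. Everything here is illustrative (the host group, the restic repo URL, and the template path are hypothetical), but it shows the shape of a scripted recovery:

```yaml
# rebuild.yml -- re-provision the box from scratch, then restore data.
# All names below are assumptions for the sketch, not from the thread.
- hosts: web
  become: true
  tasks:
    - name: Install base packages
      apt:
        name: [nginx, postgresql, restic]
        state: present
        update_cache: true

    - name: Deploy application config from a template
      template:
        src: templates/app.conf.j2
        dest: /etc/nginx/conf.d/app.conf
      notify: reload nginx

    - name: Restore the latest backup (restic repo is an assumption)
      command: restic -r s3:s3.example.com/backups restore latest --target /
      environment:
        RESTIC_PASSWORD: "{{ restic_password }}"

  handlers:
    - name: reload nginx
      service:
        name: nginx
        state: reloaded
```

With something like this checked into git, "recovery" is renting a fresh server and running one playbook against it.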

Not a bad tradeoff for 99.8% of shops out there.

stephenhuey | yesterday at 10:01 PM

I already made a comment here about testing Hatchbox. You point it to your servers and it can set up a cluster and load balancer with a few button clicks.

ozim | yesterday at 4:22 PM

If you can restore from a snapshot to a new instance at your cloud provider, keeping a second copy running is a waste of money.

I know people like FAANG LARPing, but not everyone has the budget or the need to run four nines with 24/7 support and FAANG-level traffic.

PunchyHamster | yesterday at 4:02 PM

Their original setup also ran on a single server?

If you can tolerate a few hours of downtime and some data rollback/loss, a single server plus robust backups can be a viable strategy.

timwis | yesterday at 3:47 PM

I wondered the same! FWIW I'm currently migrating from managed Postgres to self-managed on Hetzner with [autobase](https://autobase.tech/). Though of course, for high availability it requires more than one server.

BorisMelnik | yesterday at 4:21 PM

I agree with you. Even for the servers I am responsible for, I always make decisions like putting the DB on Supabase instead of local, hosting files on S3 with versioning/multi-region, etc., and then of course coming up with a backup and snapshot system.

Gud | yesterday at 4:26 PM

What time in maintenance? Hetzner has been rock solid for me.

pinkgolem | yesterday at 4:41 PM

Tbh, my one-server Paperless deployment has higher uptime than most services.

If your scaling needs aren't that high, you can get very far with a single server.

jgalt212 | yesterday at 4:29 PM

Hetzner has cheap load balancers and VMs.

izacus | yesterday at 6:14 PM

I've had, like, less than 10 minutes of downtime on Hetzner in years (funny enough, that makes my personal containers more reliable than productionized AWS and GCP deployments, with their constant partial outages). So perhaps all that complexity (beyond maybe a backup container) isn't really necessary for companies where a bit of downtime doesn't really affect revenue?

Like, I know Leetcode says otherwise, but most companies really don't need the full FAANG stack with 99.999% uptime. A day of outage every few years isn't going to affect bottom lines.
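For reference, here's what those nines actually allow per year (quick arithmetic, nothing Hetzner-specific):

```python
# Allowed downtime per year at common availability targets.
MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year permitted at a given uptime %."""
    return MINUTES_PER_YEAR * (100.0 - uptime_percent) / 100.0

for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% uptime -> {downtime_minutes(target):8.1f} min/year")
```

Five nines allows only about 5 minutes per year; even three nines is under 9 hours. That's the budget people are implicitly signing up for when they architect for it.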

surgical_fire | yesterday at 4:55 PM

The vast majority of services are actually fine with a little downtime here and there. In exchange, maintenance is a lot simpler with fewer moving parts.

People underestimate how far you can go with one or two servers. In fact, what I have seen in my career is many examples of services that should have been running on one or two servers but instead went for a hugely complex microserviced approach, all-in on cloud providers, with crazy reliability requirements for a scale that would never come.

NicoJuicy | yesterday at 4:14 PM

Depends on the app and how long downtime would take.

Deploying a new Docker instance, or just restoring the app from a snapshot and restoring the latest DB, is enough in most cases.