Hacker News

gerdesj yesterday at 11:31 PM

"The biggest mistake I made was high uptime"

Quite. I'm old enough to remember machine uptime being a badge of honour.

However, being older and not really wiser, I look for service uptime these days. Yes we did have similar back in the day, that's why MX and the like DNS records exist.
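For anyone who hasn't met MX failover: each MX record carries a preference number, and sending servers try the lowest number first, falling back to the next one if it is unreachable. A rough sketch of that selection in Python; the hostnames and the `is_up` check are made up for illustration, and real mail servers also queue and retry:

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical MX records as (preference, host); a lower preference is tried first.
MX_RECORDS = [(20, "backup-mx.example.org"), (10, "mx1.example.org")]


def pick_mail_host(
    records: Iterable[Tuple[int, str]],
    is_up: Callable[[str], bool],
) -> Optional[str]:
    """Return the most preferred reachable host, or None if all are down."""
    for _preference, host in sorted(records):
        if is_up(host):
            return host
    return None


# Primary is down, so delivery falls through to the backup.
chosen = pick_mail_host(MX_RECORDS, lambda host: host != "mx1.example.org")
```

The point is that the service (mail delivery) stays up even when one machine does not.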

Old school clusters were pretty esoteric but the lessons were learned (split brain n that) and that's why we still argue the toss with kiddies about why a Proxmox cluster with two nodes is fucked and why we recommend an additional "witness".
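The arithmetic behind the witness is just strict-majority voting. A minimal sketch, assuming one vote per node and one for the witness (the function name is mine, not anything from Proxmox or corosync):

```python
def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """A partition may keep running only if it holds a strict majority of all votes."""
    return votes_reachable > total_votes // 2


# Two nodes, link between them cut: each side sees only its own vote.
two_node_split = [has_quorum(1, 2), has_quorum(1, 2)]

# Two nodes plus a witness: the side that still reaches the witness holds 2 of 3.
witness_split = [has_quorum(2, 3), has_quorum(1, 3)]
```

With two votes, neither half of a split can claim a majority, so you either stop everything or risk both sides carrying on independently. The third vote breaks the tie.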

I don't care that VMware glossed over the whole two node HA cluster thing years ago with a massive bodge. They were wrong then and they are probably still wrong because that nonsense is probably still baked in.

Sorry, slight digression.

High uptime implies no patching. We all love patching.


Replies

andai yesterday at 11:47 PM

https://en.wikipedia.org/wiki/Split-brain_(computing)

The more you know!

>a Proxmox cluster with two nodes is fucked and why we recommend an additional "witness".

Reminds me of the three Magi from Evangelion: https://magi.kinta.ma/

pjmlp today at 5:47 AM

There is something like live patching.

One reason mainframes and micros are still around is that you can change almost everything between hardware and software without downtime.

It is also available in the surviving commercial UNIXes, and as a paid-for feature in some Linux distros, although not to the extent that those grandparent systems are capable of.

brightball today at 2:27 PM

In 2012 I took over a Perl project that was running on 25 BSD servers (OpenBSD I think?) that had not been updated / patched since 2000. It was an interesting time.

vablings today at 6:12 PM

My Raspberry Pi serves only to be the tiebreaker for my possibly split-brain 2-node cluster lol. It is literally called tiebreaker

jeanlucas today at 2:05 PM

> Yes we did have similar back in the day, that's why MX and the like DNS records exist.

Care to elaborate? I wanna know more.

AdamN today at 8:01 AM

Two is the right minimum number for a high availability data plane, but three is the right minimum number for an HA control plane.

With that said, if high availability is not a concern then 1 can be just fine.
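To put numbers on the control plane part, assuming it makes decisions by strict majority (as consensus-based control planes typically do), here is how many voters you can lose at each size. The function name is just for illustration:

```python
def tolerated_failures(voters: int) -> int:
    """How many voters can fail while a strict majority is still reachable."""
    majority = voters // 2 + 1
    return voters - majority


# Failures survivable for control planes of 1 to 5 voters.
survivable = {n: tolerated_failures(n) for n in range(1, 6)}
```

One and two voters both tolerate zero failures, so three is the first size that buys any availability. A data plane that only needs one healthy replica to serve traffic does not have that constraint, which is why two is enough there.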

j45 today at 10:55 AM

It's pretty easy to abstract away a Proxmox node into a Terraform or other type of code-based recipe for easy backup / reconstruction / upgrading.