Hacker News

dajonker · yesterday at 7:31 AM

> Making Kubernetes good is inherently impossible, a project in putting (admittedly high quality) lipstick on a pig.

So well put, my good sir, this describes exactly my feelings with k8s. It always starts off all good with just managing a couple of containers to run your web app. Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

After spending a lot of time "optimizing" or "hardening" the cluster, cloud spend has doubled or tripled. Incidents have also doubled or tripled, as has downtime. Debugging effort has doubled or tripled as well.

I ended up saying goodbye to those devops folks, nuking the cluster, booting up a single VM with Debian, enabling the firewall, and using Kamal to deploy the app with Docker. Despite having only a single VM rather than a cluster, things have never been more stable and reliable from an infrastructure point of view. Costs have plummeted as well; it's so much cheaper to run. It's also so much easier and more fun to debug.

And yes, a single VM really is fine: you can get REALLY big VMs, which is enough for most business applications like the ones we run. Most business applications only have hundreds to thousands of users. The cloud provider (Google in our case) handles hardware failures. When an upgrade would require downtime, we spin up a second VM next to it, provision it, and update the IP address in Cloudflare. Not even a need for a load balancer.
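For anyone curious what the Kamal side of such a setup might look like, here is a minimal sketch of a `config/deploy.yml`. The service name, image, IP, and hostname are all made up, and the exact schema depends on your Kamal version, so treat this as a shape, not a reference:

```yaml
# config/deploy.yml -- illustrative sketch only; names and IP are invented
service: myapp
image: myorg/myapp

servers:
  web:
    - 203.0.113.10        # the single Debian VM

proxy:
  ssl: true               # Kamal's bundled proxy terminates TLS
  host: app.example.com

registry:
  username: myorg
  password:
    - KAMAL_REGISTRY_PASSWORD   # read from the environment
```

From there `kamal deploy` builds the image, pushes it, and swaps the running container on the VM.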


Replies

adamtulinius · yesterday at 7:45 AM

If you spin up Kubernetes for "a couple of containers to run your web app", I think you were doing something wrong in the first place, especially coupled with your comment about adding SDN on top of Kubernetes.

People use Kubernetes for way too small things, and it sounds like you don't have the scale for actually running Kubernetes.

eddythompson80 · yesterday at 8:06 AM

And those devops folks just let your single debian VM be? It sounds like you have, like many of us, an organizational/people problem, not a k8s problem.

Maybe those devops folks only pay attention to k8s clusters and you're flying under their radar with your single Debian VM + Kamal. But the same thinking that results in an overly complex, impossible-to-debug, expensive-to-run k8s cluster can absolutely produce the same thing with regular VMs, unless, again, you are just left to your own devices because their policies don't apply to VMs, yet.

The problem usually is that you're one mistake away from someone shoving their nose in it. "What are you doing again? What about HA and redundancy? Slow rollout and rollback? You must have at least 3 VMs (ideally 5) and of course can't expose all VMs to the internet. You must define a virtual network with policies that we can control, and no, WireGuard isn't approved. You must split the internet-facing load balancer from the backend resources and assign different identities with proper scoping to them. Install these 4 different security scanners, these 2 log processors, this watchdog and this network monitor. Are you doing mTLS between the VMs on the private network? What if an attacker gains access to your network? What if your proxy is compromised? Do you have visibility into all traffic on the network? Everything must flow through this appliance."

psviderski · yesterday at 9:08 AM

A single VM is indeed the most pragmatic setup for what most apps really need. However, I still prefer to have at least two, for a little redundancy and peace of mind. It’s just less stressful to do any upgrades or changes knowing there is another replica in case of a failure.

And I’m building and happily using Uncloud (https://github.com/psviderski/uncloud) for this (inspired by Kamal). It makes multi-machine setups as simple as a single VM: it creates a zero-config WireGuard overlay network and uses the standard Docker Compose spec to deploy to multiple VMs, with no orchestrator or control-plane complexity. Start with one VM, then add another when needed; you can even mix cloud VMs and on-prem.
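Since the input is the standard Compose spec, the deployment description can stay as small as an ordinary `compose.yaml`. A minimal sketch (service name and image are made up, and which optional Compose fields the tool honors is something to check against its README):

```yaml
# compose.yaml -- a plain Docker Compose spec; names are illustrative
services:
  web:
    image: myorg/myapp:latest
    ports:
      - "8080:8080"
```

The same file works on a single machine and keeps working when a second VM joins the overlay network.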

yard2010 · yesterday at 7:52 AM

I don't get it; I think k8s is the best software written since Win95. It redefines computing in the same way, IMHO. I have some experience working with k8s in prod and I loved every moment of it. I'm definitely missing something.

bfivyvysj · yesterday at 7:55 AM

I thought we collectively learned this from Stack Overflow's engineering blog years ago.

Scale vertically until you can't. You're unlikely to hit that limit, and if you do, you'll have enough money to pay someone else to solve it.

Docker is amazing development tooling but it makes for horrible production infrastructure.

sibellavia · yesterday at 8:00 AM

Clearly, Kubernetes wasn’t the right solution for your case, and I also agree that using it for smaller architectures is overkill. That said, it’s the standard for large-scale production platforms that need reproducibility and high availability. As of today I don’t see many *truly* viable alternatives; honestly, I haven't seen any.

abdjdoeke · yesterday at 10:19 AM

I dunno, the more people dig into this approach, the more likely they are to just end up reinventing Kubernetes.

I use k3s/Rancher with Ansible and dedicated VMs on various providers. Flannel with WireGuard connects them all together.

This, I think, is a reasonable solution, as the main problem with cloud providers is that they're just price gouging.

serbrech · yesterday at 8:56 AM

Yes, I mean, I’m an engineer on a cloud Kubernetes service, and I don’t run Kubernetes for my home services. I just run Podman quadlets (systemd units). But that is entirely different from an enterprise-scale setup with monitoring, alerting, and scale in mind…
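For readers who haven't seen quadlets: a quadlet is a small unit file that Podman's systemd generator turns into a regular service. A minimal sketch (the image and port mapping here are arbitrary examples):

```ini
# ~/.config/containers/systemd/myapp.container -- illustrative quadlet unit
[Unit]
Description=My app as a rootless Podman container

[Container]
Image=docker.io/library/nginx:latest
PublishPort=8080:80

[Install]
WantedBy=default.target
```

After `systemctl --user daemon-reload`, the container runs via `systemctl --user start myapp.service`, with restarts and logs handled by systemd like any other unit.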

ajayvk · today at 12:27 AM

Kubernetes offers powerful low-level primitives that can support virtually any deployment architecture. However, working with these primitives directly requires significant YAML wrangling. It makes sense to build specialized solutions on top of Kubernetes that simplify common deployment patterns. Knative is one such solution. Any solution that tries to expose all underlying primitives will inevitably become as complex as Kubernetes itself.

I have been building https://github.com/openrundev/openrun, which provides a declarative solution to deploy internal web apps for teams (with SAML/OAuth and RBAC). OpenRun runs on a single machine with Docker, or it can deploy apps to Kubernetes.

a34729t · yesterday at 3:32 PM

As the strongest engineer I ever worked with commented: "Across multiple FAANG-adjacent companies, I've never seen a k8s migration go well and not require a complete reimplementation of k8s behind the APIs."

ferngodfather · yesterday at 8:15 AM

Cloud providers have put a lot of time and effort into making you believe every web app needs 99.9999% availability, making you pay for auto-scaled compute, load balancers, shared storage, HA databases, etc.

All of this just adds so much extra complexity. If I'm running Amazon.com then sure, but your average app is just fine on a single VM.
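The gap between a modest availability target and "six nines" is worth putting in numbers. A quick back-of-the-envelope calculation of the allowed downtime per year at each level:

```python
# Rough downtime budgets per year for a few availability targets.
SECONDS_PER_YEAR = 365 * 24 * 3600

def downtime_seconds_per_year(availability_pct: float) -> float:
    """Seconds of allowed downtime per year at a given availability %."""
    return (1 - availability_pct / 100) * SECONDS_PER_YEAR

for pct in (99.0, 99.9, 99.99, 99.9999):
    hours = downtime_seconds_per_year(pct) / 3600
    print(f"{pct}% availability -> {hours:.2f} hours of downtime/year")
```

At 99.9% you are allowed roughly 8.8 hours of downtime a year, which a single VM with a good provider can often meet; 99.9999% allows about half a minute, which is the regime where all the extra machinery starts earning its keep.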

germandiago · yesterday at 5:56 PM

I designed a backend with exactly the philosophy you ended up with: a load balancer? That's another problem. Better to do client-side hashing and get rid of the discovery service via a couple of DNS tricks already handled robustly elsewhere.

I took it to its maximum: every service is a piece that can break, so fewer pieces, fewer potential breakages.

When I can (which is 95% of the time), I embed certain other services inside the server executables themselves and make them activatable at startup (though I don't want my infra to drift, so I use the same set of subservices in each).

But the idea is: the fewer services, the fewer problems. Even with the trade-offs, I think it is operationally much more manageable and robust in the end.
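The "client-side hashing instead of a load balancer" idea can be sketched with rendezvous (highest-random-weight) hashing. A minimal illustration, with made-up hostnames; the comment doesn't say which scheme was actually used, this is just one standard way to do it:

```python
import hashlib

def pick_server(key: str, servers: list[str]) -> str:
    """Rendezvous (highest-random-weight) hashing: every client hashes
    (server, key) pairs and picks the highest score, so all clients agree
    on the same server for a given key without a central load balancer."""
    def score(server: str) -> int:
        return int(hashlib.sha256(f"{server}:{key}".encode()).hexdigest(), 16)
    return max(servers, key=score)

# Hypothetical hostnames, for illustration only.
servers = ["app1.example.com", "app2.example.com", "app3.example.com"]
print(pick_server("user-42", servers))
```

A nice property: when a server is removed, only the keys that were mapped to it move; everything else stays where it was, which is exactly what you want when a node dies or a DNS entry is dropped.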

stasge · today at 1:00 AM

The problem is not Kubernetes but how it's treated. From its inception I've been seeing two anti-patterns: treating it as a platform (and being frustrated when Kubernetes doesn't meet expectations) and treating it as a product or part of a product (investing heavily into its customization and making it a dependency). Neither is practical unless you are building a platform and it is your product.

Otherwise it should be viewed as an OS and treated as a commodity. You create a single big VM with MicroK8s per project (zero-ops vanilla Kubernetes) and take no dependency on how exactly Kubernetes is set up. This way you can run the same setup locally and in a data center. If ever needed, your app can be moved to any cloud as long as that cloud meets basic prerequisites (like the presence of persistent storage or a load balancer). The best part is that Kubernetes (unlike a traditional OS) is API-driven, and your apps can be nicely packaged and managed using Terraform/OpenTofu or similar tooling.

bedobi · yesterday at 3:08 PM

At a previous job, our build pipeline

* Built the app (into a self contained .jar, it was a JVM shop)

* Put the app into an Ubuntu Docker image. This step was arguably unnecessary, but the same way Maven is used to isolate JVM dependencies ("it works on my machine"), the purpose of the Docker image was to isolate dependencies on the OS environment.

* Put the Docker image onto an AWS AMI that only had Docker on it, whose sole purpose was to run the Docker image.

* Combined the AMI with an appropriately sized EC2 instance.

* Spun up the EC2 instances and flipped the AWS ELBs to point to the new ones, blue-green style.

The beauty of this was the stupidly simple process and the complete isolation of all the apps. No cluster running multiple apps with diverse CPU and memory requirements simultaneously. No k8s complexity. Still had all the horizontal scaling benefits, etc.
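The "jar into an Ubuntu image" step can be pictured as an ordinary Dockerfile along these lines (base image, JDK version, and paths are illustrative, not from the original pipeline):

```dockerfile
# Illustrative sketch only: image, JRE version, and jar path are made up.
FROM ubuntu:22.04

RUN apt-get update && \
    apt-get install -y --no-install-recommends openjdk-17-jre-headless && \
    rm -rf /var/lib/apt/lists/*

# The self-contained jar produced by the build step.
COPY target/app.jar /opt/app/app.jar

ENTRYPOINT ["java", "-jar", "/opt/app/app.jar"]
```

The point of the layer is exactly what the comment says: pin the OS environment the same way Maven pins the JVM dependencies.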

PunchyHamster · yesterday at 9:01 AM

Well, you used a tank to plow a field and then complained about the maintenance and fuel usage.

If you have an actual need to deploy a few dozen services all talking with each other, k8s isn't a bad way to do it. It has its problems, but it allows your devs to mostly self-service their infrastructure needs instead of filing a ticket for every VM and firewall rule they need. I'm saying this from the perspective of migrating from the "old way" to a 14-node k8s cluster on actual hardware.

It does make debugging harder, as you pretty much need a central logging solution, but at that scale you want one anyway, so it isn't a big jump, and developers like it.

The main problem with k8s is frankly nothing technical; it's the "ooh, shiny" problem developers have, where they see tech and want to use it regardless of anything else.

goosejuice · yesterday at 3:45 PM

I started using GKE at a seed-stage company in 2017. It's still going fine today. I had zero ops experience and found it rather intuitive. We brought in Istio for mTLS and outbound traffic policies and that worked pretty well too. I can only remember one fairly stressful outage caused by the control plane, and it ended up remedying itself. I would certainly only use a managed k8s.

So I guess I'm a fan. I use a monolith for most of my stuff if I have the choice, but if I'm working somewhere or on something where I have to manage a bunch of services I'm most certainly going to reach for k8s.

elAhmo · yesterday at 9:58 AM

Not advocating for complexity or k8s, but if your workload can be served by a single VM, then you are orders of magnitude away from the volume and complexity that would push you toward a k8s setup; there's no debate about it.

There are also situations where a single VM, no matter how powerful, cannot do the job.

zerkten · yesterday at 1:15 PM

>> Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

I don't work that closely with k8s, but have toyed with a cluster in my homelab, etc. Way back before it really got going, I observed some OpenStack folks make the jump to k8s.

Knowing what I knew about OpenStack, that gave me an inkling that what you describe would happen and we'd end up in this place where a reasonable thing exists but it has all of this crud layered on top. There are places where k8s makes sense and works well, but the people surrounding any project are the most important factor in the end result.

Today we have an industry around k8s. It keeps a lot of people busy and employed. These same folks will repeat k8s next time, so the best thing people who feel they have superior taste can do is press forward with their own ideas, as the behavior won't change.

jkukul · yesterday at 12:43 PM

> I ended up saying goodbye to those devops folks,

The irony is that "DevOps" was supposed to be a culture and a set of practices, not a job title. The tools that came with it (i.e. Kubernetes) turned out to be so complex that most developers didn't want to deal with them, and DevOps became exactly the siloed role the movement was trying to eliminate.

That's why I get an ick when someone uses DevOps as a job title. Just say "sysadmin" or "infrastructure engineer". Admit that you failed to eliminate the silos.

dgb23 · yesterday at 8:14 AM

> Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

I'm not familiar with kubernetes, but doesn't it already do SDN out of the box?

huijzer · yesterday at 9:45 PM

I always looked into k8s and then realized it solves YouTube-scale problems, which I don’t have.

simonebrunozzi · yesterday at 4:14 PM

> nuking the cluster, booted up a single VM with debian, enabled the firewall and used Kamal to deploy the app with docker.

Absolutely brilliant. Love it.

m4ck_ · yesterday at 11:30 AM

And if you need a cluster, HashiCorp Nomad seems like a more reasonable option than full-blown Kubernetes. I've never actually used it in prod, only in a lab, but I enjoyed it.

dnnddidiej · yesterday at 12:31 PM

That's all good, but at bigger orgs with massive workloads and the teams to build it out, k8s makes sense. It's a standard and brilliant tech.

dobreandl · yesterday at 12:33 PM

We've reduced our costs on Hetzner to about 10% of what we paid on Heroku, for 10x the performance. Kamal really kicks ass, and you can have a pretty complicated infrastructure up in no time. We're using Terraform, Ansible + Kamal for deploys, no issues whatsoever.

BirAdam · yesterday at 11:42 AM

So... if you're at the point where you're using a single VM, I have to ask: why bother with Docker at all? You're paying context-switch, memory, and disk overhead that you don't need. Just make an image of the VM in case you need to drop it behind an LB.

tdrz · yesterday at 7:06 PM

I'm very happy with my k8s setup for my small startup. I believe it would have been much harder for me to get it off the ground, manage it etc. without it.

marcosscriven · yesterday at 8:04 AM

First time I’ve heard of Kamal. Looks ideal!

Do you pair it with some orchestration (to spin up the necessary VM)?

ricardo_lien · yesterday at 9:46 AM

Yes, I've had similar experiences. My life has been much easier since I migrated to ECS Fargate - the service just works great. No more 2AM calls (at least not because of infra incidents), no more cost concerns from my boss.

wernerb · yesterday at 7:57 AM

DevOps lost the plot with the Operator model. When it was being widely introduced as THE pattern, I was dismayed. These operators abstract entirely complex services like databases behind YAML and custom Go services. At KubeCon I had one guy tell me he collects operators like candy. Questions about lifecycle management, and about the inevitable large architectural changes in an ever-changing operator landscape, were handwaved away with a series of staging and development clusters. This adds so much cost. Fundamentally, the issue is that the abstractions are too much and sit entirely on the DevOps side of the "shared responsibility model": taking an RDBMS from AWS or Azure is vastly superior to taking on all that responsibility yourself in the cluster. Meanwhile (being a bit of an infrastructure snob) I run NixOS with systemd OCI containers at home. With AI this is the easiest thing to maintain ever.
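For readers who haven't met the pattern being criticized: an operator lets you declare something as complex as a replicated database in a few lines of YAML, with a custom controller doing all the work behind the scenes. A purely hypothetical custom resource, invented here for illustration (the API group, kind, and fields are not from any specific operator):

```yaml
# Hypothetical custom resource; a real operator defines its own schema.
apiVersion: databases.example.com/v1
kind: PostgresCluster
metadata:
  name: orders-db
spec:
  replicas: 3
  version: "16"
  storage: 100Gi
```

The criticism above is that everything this YAML implies (failover, backups, upgrades) becomes your operational responsibility the moment the controller misbehaves.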

1dom · yesterday at 7:59 AM

I think this comment and replies capture the problem with Kubernetes. Nobody gets fired for choosing Kubernetes now.

It's obvious to you, me, and the other 2 presumably techie people who've responded within 15 minutes that you shouldn't have been using Kubernetes. But you probably work in a company full of techie people, who ended up using Kubernetes.

We have HN, an environment full of techie people who immediately recognise not to use k8s in 99% of cases; yet in actual paid professional environments, in 99% of cases, the same techie people will tolerate, support and converge on the idea that they should use k8s.

I feel like there's an element of the emperor's new clothes here.

BowBun · yesterday at 1:57 PM

What scale is this story operating at? My experience managing a fleet of services is that my job would take 10x as long without k8s. It's hard, not bad.

collimarco · yesterday at 10:58 AM

Kubernetes is not bad, it's just low level. Most applications share the exact same needs (proof: you could run any web app on a simple platform like Heroku). That's why some years ago I built an open source tool (with 0 dependencies) that simplifies Kubernetes deployments with a compact syntax that works well for 99% of web apps (instead of allowing any configuration, it makes many "opinionated" choices): https://github.com/cuber-cloud/cuber-gem

I have been using it for all the company's web apps and web services for years and everything works nicely. It can also auto-scale easily, which allows us to manage huge spikes of traffic for web push (Pushpad) at a reasonable price (good luck doing that with a VM, which doesn't scale, or with a PaaS, at very high cost).

robshep · yesterday at 7:38 AM

If you replaced k8s with a single app on a single VM, then you've taken a hype-fuelled circuitous route to where you should have been anyway.

gregdelhon · yesterday at 10:10 AM

Not so surprising that the architecture approach pushed by cloud vendors is... increasing cloud spend!

mynameisash · yesterday at 4:55 PM

My first and really only experience with Kubernetes was a project I did about six years ago. I was tasked with building a thing that did some lightly distributed compute using Python + Dask. I was able to cobble together a functioning (internal) product, and we went to production.

Not long after, I found that the pods were CONSTANTLY getting into some weird state that K8s couldn't recover from, so I had to forcibly delete the pods and rebuild. I blamed myself, not knowing much about K8s, but it was also extremely frustrating because, as I understood it, the entire purpose of Kubernetes is to ensure a reliable deployment of some combination of pods. If it couldn't do that and I instead had to manually rebuild my cluster, then what was the point?

In the end, I nuked the entire stack (K8s, Docker containers, Python, and Dask) and instead went with a single Rust binary deployed to an Azure Function. The result was faster (by probably an order of magnitude), used less memory, was cheaper (maybe -80% cost), and was much more reliable (I think around four nines).

whalesalad · yesterday at 11:27 AM

Your use case is very small and simple. Of course a single VM works. The fact that you deploy by changing a literal A record at CF confirms this.

That is not what kube is designed for.

Melatonic · yesterday at 11:45 PM

MicroVMs are going to make all of these redundant.

KaiserPro · yesterday at 6:48 PM

> and an entire software-defined networking layer on top of it.

This is one of the main fuckups of k8s, the networking is batshit.

The other problem is that secrets management is still an afterthought.

The thing that really winds me up is that it doesn't even scale up that much. 2k nodes and it starts to really fall apart.

throwaway894345 · yesterday at 3:16 PM

This feels like the microservices versus monolith problem. You can use cloud services or not, and that's orthogonal to running your app in Kubernetes or in a VM.

Similarly, I suspect (based on your "hardening" grievance) that a lot of your tedium is just that cloud APIs generally push you toward least-privileges with IAM, which is tedious but more secure. And if you implement a comparably secure system on your single VM (isolating different processes and ensuring they each have minimal permissions, firewall rules, etc) then you will probably have strictly more incidents and debugging effort. But you could go the other way and make a god role for all of your services to share and you will spend much less time debugging or dealing with incidents.

Even with a single VM, you could throw k3s on it and get many of the benefits of Kubernetes (a single, unified, standardized, extensible control plane that lots of software already supports) rather than having to memorize dozens of different CLI utilities, their configuration file formats, their path preferences, their logging locations, etc. And as a nice bonus, you have a pretty easy path toward high availability if you decide you ever want your software to run when Google decides to upgrade the underlying hardware.

j45 · yesterday at 3:08 PM

There exists a sweet spot between plain Docker and Docker Swarm: not quite Portainer, but a bit more.

The tools in this space can really help make a few containers in dev/staging/production much more manageable.

znpy · yesterday at 9:32 AM

> It always starts off all good with just managing a couple of containers to run your web app. Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

As a devops/cloud engineer coming from a pure sysadmin background (you've got a cluster of n machines running RHEL and that's it), I feel this.

The issues I see, however, are of a different nature:

1. Résumé-driven development (people get a higher-paying job if they have the buzzwords in their CV).

2. A general lack of core Linux skills. People don't actually understand how Linux and Kubernetes work, so they can't build the things they need, so they install off-the-shelf products that do 1000 things including the single one they need.

3. Marketing, trendy stuff and FOMO that tell you that you absolutely can't live without product X, or that you must absolutely be doing Y.

To give an example of 3: FluxCD/ArgoCD. They're large and clunky, and we're getting pushed to adopt them for managing the services that we run inside the cluster (not developer workloads, but mostly-static stuff like the LGTM stack and a few more things: core services, basically). They're messy; they add another layer of complexity, more software to run and troubleshoot, more cognitive load.

I'm pushing back on that, and frankly, for our needs I'm fairly sure we're better off using Terraform to manage Kubernetes stuff via the Kubernetes and Helm providers. I've done some tests and it works beautifully.
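As a sketch of that approach, assuming the Terraform Helm provider (v2 block syntax) and a kubeconfig on disk; the release, chart, and value chosen here are illustrative examples, not a recommendation:

```hcl
# Illustrative sketch: manage an in-cluster service as a Helm release.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

resource "helm_release" "grafana" {
  name       = "grafana"
  repository = "https://grafana.github.io/helm-charts"
  chart      = "grafana"
  namespace  = "monitoring"

  set {
    name  = "persistence.enabled"
    value = "true"
  }
}
```

A `terraform plan` then shows chart upgrades and value changes alongside the rest of the infrastructure diff, which is the skills-reuse argument made above.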

It's also the same tool we use to manage infrastructure, so we get to reuse a lot of skills we already have.

It's also fairly easy to inspect: I'm doing some tests with https://pkg.go.dev/github.com/hashicorp/hcl/v2/hclparse and building some internal tooling to do static analysis of our Terraform code and automated refactoring.

I still think Kubernetes is worth the hassle, though (I mostly run EKS, which by the way has been working very well for me).

holoduke · yesterday at 3:47 PM

And nowadays, with Claude, you can spin up clusters of VPS machines in a few hours. All bare Debian, with nothing on them except nginx and the apps. Mass configuration without any tools, using only Claude. Works perfectly. The costs saved without all the overhead are massive.
