logoalt Hacker News

embedding-shapetoday at 1:42 PM1 replyview on HN

Why wouldn't your monitoring system alert you when metrics suddenly disappear? Sounds like you need a better monitoring system, prometheus is not gonna magically solve that problem for you. No wonder you were having issues with prometheus...


Replies

dijittoday at 1:59 PM

I'm not sure what you mean.

Of course the systems that have to alert me to failure have to be designed with mechanisms to alert me to the fact that they themselves are failing.

Zabbix, Nagios, Munin -- practically everything that existed before: understood this.

Prometheus doesn't because it optimised intentionally for being easy to deploy and for there being a hierarchy of prometheus's in a tree-like formation. Which makes sense, but forces a much more distributed and difficult to reason model.

Monitoring systems can't be designed for the happy path. By definition, they only matter when things are going wrong- which is precisely when the happy path isn't available. Prometheus is excellent when everything is fine (scaling aside). That's not when you need your monitoring system to be excellent.

show 1 reply