Hacker News

xiphias2 today at 7:35 AM

Cloudflare as well


Replies

talonx today at 1:35 PM

Services like Cloudflare and Twilio have so many POPs globally that one or more always have an outage going on. Then there's the question of whether it's a major outage or a minor outage. Even though major status page providers like Atlassian and Incident.io have public status APIs (Cloudflare uses Atlassian), it takes more than just parsing them to determine what is "down" and at what granularity.

I run an outage detection service, and some of these issues, like parsing hundreds of (sometimes undocumented) status APIs, make for an interesting engineering problem.
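To make the granularity point concrete, here is a rough sketch of how one might roll up a Statuspage-style summary payload into a single verdict. It is only an illustration, not how any particular detection service works. It assumes the payload has a `components` list whose entries carry a `status` string (such as `operational`, `degraded_performance`, `partial_outage`, `major_outage`) and a `group` flag. The `classify` function and the `major_fraction` threshold are hypothetical names and values chosen for the example.

```python
from collections import Counter

# Component status strings assumed to follow the Statuspage-style convention.
DOWN = {"partial_outage", "major_outage"}
DEGRADED = {"degraded_performance"}


def classify(summary, major_fraction=0.25):
    """Return 'operational', 'minor', 'major', or 'unknown' for one payload.

    major_fraction is a hypothetical threshold: the share of components
    that must be down before the whole service is called a major outage.
    """
    # Skip group containers (assumed to be flagged with "group": true)
    # so that only leaf components such as individual PoPs are counted.
    components = [c for c in summary.get("components", []) if not c.get("group")]
    if not components:
        return "unknown"

    counts = Counter(c.get("status", "unknown") for c in components)
    down = sum(counts[s] for s in DOWN)
    degraded = sum(counts[s] for s in DEGRADED)

    if down / len(components) >= major_fraction:
        return "major"
    if down or degraded:
        return "minor"
    return "operational"


# Example: 200 PoPs, one of them partially down.
pops = [{"name": f"pop-{i}", "status": "operational", "group": False}
        for i in range(200)]
pops[0]["status"] = "partial_outage"
print(classify({"components": pops}))  # prints "minor"
```

A single flat threshold is the crudest possible choice. In practice one would probably want to weight components by region or by product, which is where the provider-specific knowledge comes in.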
