Lesson learned: If your recovery plan requires calling any API in the dead region — to detach an IP, describe a route table, launch an instance, read an S3 object, or decrypt a volume — it will fail when you need it most.
Every dependency on the primary region is a dependency on the thing that just broke.
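One way to make that concrete is to audit the runbook itself. A minimal sketch, assuming you can annotate each recovery step with the regions whose APIs it calls; the `Step` type, the step names, and the `region-a`/`region-b` labels are all hypothetical and not tied to any real tool:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    regions: frozenset  # regions whose APIs this step must call (hand-annotated)


def steps_touching(runbook, dead_region):
    """Return the names of steps that need an API in the dead region."""
    return [s.name for s in runbook if dead_region in s.regions]


PRIMARY = "region-a"  # hypothetical label for the region that failed

# Hypothetical runbook; the annotations are the part you have to get right.
runbook = [
    Step("detach IP from old instance", frozenset({"region-a"})),
    Step("launch replacement instances", frozenset({"region-b"})),
    Step("read config from S3 bucket", frozenset({"region-a"})),
    Step("update DNS at external provider", frozenset()),
]

blocked = steps_touching(runbook, PRIMARY)
print(blocked)
```

Any step that shows up in `blocked` is one that would hang or fail during a real regional outage, so it either needs a standby-region equivalent or has to come out of the plan.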
In the eighties my friends and I used to think we would be the first to go in a nuclear strike because we were close to an American air base. Now I have to worry about living close to an Amazon data centre.
Hmmm. We’ve seen the fun that comes from cutting data cables and pipelines. Think that’s been factored in with the asymmetric warfare coming from the Middle East? Perhaps some network assaults as well?
Krugman has pointed out that modern war is bloody expensive. Perhaps resistance will just be helping burn money? Lots of motivated people on one side. And I hope countries are being careful, as a Thirty Years War in the Middle East would suck.
The DR test isn't 'can we run in region B.' It's 'can we cut over to region B when every API call to region A times out.' Most recovery plans assume they can still reach the thing that just broke.
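That kind of test can be rehearsed without a real outage. A rough sketch, assuming a stand-in client where every call to the dead region raises a timeout; `FakeCloud`, the action names, and both cutover functions are made up for illustration and are not a real SDK:

```python
class FakeCloud:
    """Stand-in for a cloud API client: calls into a dead region time out."""

    def __init__(self, dead_regions):
        self.dead = set(dead_regions)
        self.calls = []  # record of (region, action) for later inspection

    def call(self, region, action):
        self.calls.append((region, action))
        if region in self.dead:
            raise TimeoutError(f"{action} in {region} timed out")
        return "ok"


def naive_cutover(cloud):
    # Hypothetical plan that reaches back into the failed region first.
    cloud.call("region-a", "detach_ip")
    cloud.call("region-b", "attach_ip")


def isolated_cutover(cloud):
    # Hypothetical plan that only talks to the standby region.
    cloud.call("region-b", "launch_instances")
    cloud.call("region-b", "promote_replica")


def survives_outage(cutover, dead_region="region-a"):
    """Run a cutover against a client where dead_region is unreachable."""
    cloud = FakeCloud([dead_region])
    try:
        cutover(cloud)
    except TimeoutError:
        return False
    return True
```

Here `survives_outage(naive_cutover)` comes back `False` and `survives_outage(isolated_cutover)` comes back `True`. In a real setup the equivalent would be blocking network access to region A's endpoints and running the actual runbook, but the pass/fail question is the same.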
Related discussion: https://news.ycombinator.com/item?id=47209781
Any chance this is causing the Claude issues directly/indirectly?
More recent news: https://www.businessinsider.com/amazon-data-centers-middle-e...
> Two facilities in the United Arab Emirates sustained direct hits, while a third facility in Bahrain was damaged by a drone strike "in close proximity,"
Also to add context: AWS has contracts with the US military: "The Joint Warfighting Cloud Capability (JWCC) contract enables AWS to continue providing Department of Defense (DoD) customers with secure, reliable, and mission-critical cloud services." https://aws.amazon.com/federal/defense/jwcc/ Making them a target for retaliation ofc.
Typical BBC reporting: Amazon's cloud computing business says drones have hit three of its facilities in the United Arab Emirates (UAE) and Bahrain following US and Israeli strikes against Iran at the weekend. The incidents occurred on Sunday morning, with Amazon Web Services (AWS) saying at the time that "objects" had hit a data centre in the UAE, creating "sparks and fire". Also on Sunday, AWS said it was investigating power and connectivity issues at a facility in Bahrain. On Monday, the company confirmed that drone strikes had caused the outages.
Notice how they do not say the facilities were damaged by Iranian attacks on the UAE and Bahrain, only that it happened "following US and Israeli strikes against Iran at the weekend".
I've been working on that for a client since yesterday (as a fractional CTO). Pretty hectic: basically nothing really works, and we don't know yet whether all data is lost, whether anything is recoverable, or when AWS UAE will become functional again so we can recover that region.
Finally, I have a very good argument for multi-region deployments ;))
That's my go-to website atm: https://health.aws.amazon.com/health/status