Hacker News

Cell-based architecture for resilient payment systems

142 points by birdculture last Monday at 10:36 PM | 58 comments

Comments

Insimwytim today at 12:32 AM

Whole lot of nothing.

This isn't about payment technologies, it's not about isolating transactions, it's about scaling the middle layer. What's worse, it doesn't even explain what the middle layer does.

No info on how routing works, no info on data synchronization.

Folks are just learning Kubernetes and writing extremely abstract stuff.

physix yesterday at 10:28 PM

Nobody uses Amex for payments, so the system isn't ever under high load.

Just kidding!

I find the idea quite good, and have to assume that the number of payment failures they experience due to partitions/outages isn't very high and that the post-payment reconciliation and reclamation process gives them the liberty to rank availability a bit higher than correctness.

One thing that looked a bit shaky was the interplay between the global transaction router's state of knowing which cells can handle a particular payment and the asynchronous distribution of the "failover data", which I presume it needs to know to route correctly. To me that seems to create a window where it might route to the wrong cell due to an outdated routing state.
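To make that window concrete, here is a minimal sketch. It is not the design from the article: the `Router` class, its `publish`/`apply_updates` methods, and the cell and payment IDs are all hypothetical names made up for illustration. The only assumption carried over from the comment is that the router learns about a failover asynchronously.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical global router holding a cached payment -> cell map."""
    routes: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # updates not yet applied

    def publish(self, payment_id: str, cell: str) -> None:
        # Asynchronous distribution: the update is queued, not applied.
        self.pending.append((payment_id, cell))

    def apply_updates(self) -> None:
        # Runs some time later, e.g. once the replication stream catches up.
        for payment_id, cell in self.pending:
            self.routes[payment_id] = cell
        self.pending.clear()

    def route(self, payment_id: str) -> str:
        return self.routes[payment_id]


router = Router(routes={"pay-1": "cell-a"})

# cell-a fails; the failover data for pay-1 now lives in cell-b.
router.publish("pay-1", "cell-b")

stale = router.route("pay-1")  # routed before the update lands
router.apply_updates()
fresh = router.route("pay-1")  # routed after the update lands

print(stale, fresh)
```

Any request that arrives between `publish` and `apply_updates` is sent to the old cell, which is the window described above. How wide it is in practice depends on the replication lag, which the article does not state.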

It also doesn't go into the HA setup of the global transaction router itself.

But still, I kind of like the design.

rdtsc today at 3:35 AM

Some of it sounds like it reinvented Erlang supervision trees (https://learnyousomeerlang.com/supervisors). As a joke, we were calling gen_servers “nanoservices” there. Granted, that was mostly when microservices were the hot new thing.
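For readers who have not used Erlang, the core idea of a supervisor can be sketched roughly as follows. This is a loose Python analogy, not OTP itself: `supervise` and `flaky_worker` are hypothetical names, and real supervisors count restarts within a time window rather than as a plain total.

```python
from typing import Callable


def supervise(child: Callable[[], str], max_restarts: int) -> tuple[str, int]:
    """Run `child`, restarting it on failure up to `max_restarts` times.

    Returns the child's result and the number of restarts used. If the
    restart budget is exhausted, the last exception is re-raised, which in
    a real tree would escalate to the parent supervisor.
    """
    restarts = 0
    while True:
        try:
            return child(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            restarts += 1


calls = {"n": 0}


def flaky_worker() -> str:
    # Fails on the first two runs, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("worker crashed")
    return "done"


result, restarts = supervise(flaky_worker, max_restarts=5)
print(result, restarts)
```

The resemblance to cells is that the failure is contained to one unit and handled by something outside it, instead of by defensive code inside the unit.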

mkhalil today at 12:53 AM

microservices / clusters / zones - really all of these are "cell-based" architectures as well. there is absolutely no written rule that a microservice has to be just an API or a single service; it can basically be an independent instance that is testable/usable/gives value on its own.

that said: still a nice write-up, learning about some of the architectural choices that AMEX makes is definitely insightful (and relevant/useful to what i am working on right now as well!)

nightshift1 yesterday at 11:26 PM

All I can see is a giant single point of failure called the Global Transaction Router.

chmod775 today at 4:27 PM

As described, if one "cell" crashes, you re-route/retry everything on the other cell. Assuming your system is deterministic and the inputs stay the same, the second cell should now break as well.
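A minimal sketch of that failure mode, assuming every cell runs the same deterministic code. The `process` handler, its negative-amount bug, and the cell names are hypothetical and only stand in for "some input that crashes the handler":

```python
def process(payment: dict) -> str:
    """Deterministic handler shared by every cell (same code, same bug)."""
    if payment["amount"] < 0:
        raise ValueError("negative amount not handled")
    return "approved"


def route_with_failover(payment: dict, cells: list[str]) -> list[tuple[str, str]]:
    """Try each cell in order and record what happened in each one."""
    attempts = []
    for cell in cells:
        try:
            attempts.append((cell, process(payment)))
            break  # stop at the first cell that succeeds
        except ValueError as exc:
            attempts.append((cell, f"crashed: {exc}"))
    return attempts


poison = route_with_failover({"amount": -5}, ["cell-a", "cell-b"])
healthy = route_with_failover({"amount": 20}, ["cell-a", "cell-b"])

print(poison)
print(healthy)
```

The retry only helps when the first failure came from the environment (hardware, network, a bad deploy in one cell) and not from the input itself. With an input-driven crash, failover just carries the same failure to the next cell.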

This reads to me like an attempt to patch a system that's already fucked beyond belief while pretending you're doing "engineering".

Fancy implementation of a retry loop attempting to minimize downtime.

inigyou today at 12:15 AM

403 Forbidden

Because of the title I was expecting to read about doing payments with a distributed network, like a terrorist cell network, or something like Hawala. Not (as I infer from other comments) Amex using multiple independent systems.

bozhark today at 3:50 PM

Blockchain payments lite

daxfohl today at 4:36 PM

"We broke our monolith into microservices and wish we hadn't"

neerajsi yesterday at 9:39 PM

I wonder how they ensure durability. Is it possible that a cell going down would roll back a payment after it has occurred? Or do they depend on a non-cell database?

kev009 yesterday at 9:22 PM

These things are always a clusterfsck compared to the mainframe deployments.

stevefan1999 yesterday at 11:04 PM

Backing up would be hell

jeremycarter yesterday at 9:42 PM

As Reddit already pointed out, this is nothing novel.

badlibrarian yesterday at 10:35 PM

Ah yes, the financial services company that runs a travel agency, allows me to book my hotel and rental car weeks in advance, registers a hold for incidentals for both the hotel and car when I check in, then blocks the card when I try to buy dinner that night in that same hotel due to fraud detection.

Last week it required me to take pictures of my face from multiple angles to regain membership privileges. I suspect this may be part Palantir data collection and part Peter Thiel dating service.

charcircuit today at 1:59 AM

This is service-oriented architecture, except more expensive and complicated.

llmslave yesterday at 10:20 PM

American Express tech is some of the worst in the world among big companies. All of the value in the company is just in the branding. They put some work into the mobile app and the website, but other than that, it's a facade.

great_wubwub yesterday at 11:13 PM

Makes me a little nervous that a web page about resilience is failing to connect.

toast0 last Monday at 11:34 PM

They run their payment systems on ps3??? Somebody bought into the marketing a bit much.

rekttrader yesterday at 9:19 PM

So you’re telling me these cells operate independently like distributed Ethereum nodes and L2s… got it.
