Building in crypto doesn't automatically mean infra is decentralized.
The assets may live on decentralized networks. But traditional infrastructure risks still apply if the exchange itself depends on centralized cloud providers, physical datacenters, and single coordination layers.
On May 7th, Coinbase experienced service disruptions. Here's a quick summary of what happened:
- Around 8 PM ET, Coinbase systems flagged high error rates across multiple services.
- We traced these errors to Amazon failures in Availability Zone (use1-az4) in the AWS US-EAST-1 Region.
- Coinbase systems are designed to be resilient to a single zone outage, and are designed to recover quickly if this happens.
- In this case, we observed failures impacting multiple AWS zones, which caused an extended outage of core trading services.
- Coinbase users experienced an extended outage while the AWS team worked to restore temperature controls and other Amazon Managed Services.
The primary issue is now fully resolved. Thank you for your patience. If you have any outstanding questions about your account, please reach out to Coinbase Support; we're ready to help.
Our team will conduct a full analysis. Details may change as our investigation progresses and more information is received from AWS's official retrospective, once published.
This outage highlights a bigger infrastructure issue.
When critical systems rely on centralized cloud environments, even strong redundancy can still inherit physical datacenter risks. An overheated room should not have the power to disrupt major financial infrastructure.
We experienced an outage at Coinbase last night, which is never acceptable. The root cause was a room overheating in an AWS datacenter when multiple chillers failed. We design our services to be redundant to downtime in any one AWS Availability Zone (AZ), and most of our systems worked this way last night, but not all.
Our centralized exchange did not. Exchanges have unique architectures that optimize for latency and co-location of clients. It is possible to make exchanges resistant to AZ failures, but doing so can add undesirable latency and break customer co-location. Given this incident, we'll revisit these tradeoffs to ensure we're giving you the best possible venue to trade. At a minimum, we should be able to reduce the duration of an outage considerably when an AZ move is needed.
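To make the single-AZ tradeoff concrete, here is a minimal sketch of how one might audit which services would not survive the loss of an Availability Zone. Everything in it is an assumption for illustration: the service names, the placement map, and the AZ IDs other than use1-az4 are hypothetical and do not describe Coinbase's actual topology.

```python
from typing import Dict, List, Set

# Hypothetical placement map: service name -> AZ IDs it runs in.
# Illustrative only; not Coinbase's actual topology.
PLACEMENT: Dict[str, Set[str]] = {
    "api-gateway": {"use1-az1", "use1-az2", "use1-az4"},
    "wallet-service": {"use1-az2", "use1-az4"},
    # Pinned to one AZ to keep latency low and preserve client co-location.
    "matching-engine": {"use1-az4"},
}


def services_down(placement: Dict[str, Set[str]], failed_azs: Set[str]) -> List[str]:
    """Return the services that have no replica left outside the failed AZs."""
    return sorted(name for name, azs in placement.items() if not (azs - failed_azs))


def single_az_risks(placement: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each AZ to the services that go down if only that AZ fails."""
    all_azs: Set[str] = set().union(*placement.values())
    risks = {az: services_down(placement, {az}) for az in sorted(all_azs)}
    # Keep only the AZs whose loss actually takes something down.
    return {az: names for az, names in risks.items() if names}


if __name__ == "__main__":
    print(single_az_risks(PLACEMENT))
    print(services_down(PLACEMENT, {"use1-az2", "use1-az4"}))
```

In this toy model the latency-sensitive service is the only one pinned to a single AZ, so it is the only one flagged by the single-AZ check. Spreading it across zones would clear the flag, at the cost of the latency and co-location tradeoffs described above.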
Thank you to the AWS and Coinbase teams for working through the night to mitigate the issue. We'll share the detailed technical summary once it's ready.
Just wrapped Day 1 at DevWorld Amsterdam.
Met incredible builders, had sharp conversations, and one thing is clear: developers are actively looking for better infrastructure, stronger privacy, and systems that actually scale.
Day 2 let's gooo!
all of this is accurate. the architecture genuinely is excellent.
the one thing worth knowing is that Tailscale's control plane is still theirs.... the mesh is peer-to-peer but the coordination layer isn't. if that matters to you (and for a lot of teams it increasingly does) the true self-hosted path is worth looking at.
we built @netcore_inc around the same WireGuard + mesh primitives but with no third-party coordination. curious what the thread thinks about that tradeoff.