On Monday, October 20th, Amazon’s US-East-1 data center cluster had a very bad day. The problems started around 7:00 AM GMT and didn’t let up for over 15 hours. Since US-East-1 is AWS’s oldest and biggest cluster, the damage was extensive — over 1,000 companies experienced downtime, including ChatGPT, Lyft, Facebook, Duolingo, Zoom, Reddit, Venmo, and Coinbase. People couldn’t withdraw cash, book flights, or summon rides. A Premier League match lost its automated offside detection. The outage was so overwhelming it dominated news cycles worldwide, with major outlets like BBC running live coverage.

The AWS disaster didn’t touch a single Adapty client, even though we partly rely on AWS infrastructure too. When our clients use Adapty, they’re trusting us with their entire payment infrastructure. Over 15,000 mobile apps depend on us staying online. If Adapty goes down, our clients stop making money, because their users cannot buy subscriptions. Since infrastructure is our product, reliability isn’t a feature. It’s the whole point.
Building infrastructure that refuses to break
Adapty runs on hybrid infrastructure — part on-premise servers, part cloud services. We use load balancers to distribute traffic and obsess over redundancy. The core philosophy: don’t depend on any single cloud provider. Make the system work under any scenario. Even if the rest of the internet is on fire (though at that point, we’d all have bigger problems).
Additionally, we operate a global CDN with over 400 locations. This lets us process requests locally. Your API call gets answered by whichever server is geographically closest. That architectural decision matters when you’re handling over 4 billion requests daily, sometimes spiking to 10 billion.
Fallback mechanisms: The paywall must always load
We can invest in redundancy and architect for failure, but we’ll never reach 100% uptime. AWS can knock out entire regions. CDNs can hiccup. Networks fail in unexpected ways. What we don’t accept is our clients losing revenue because of it. So we built fallbacks for every conceivable failure scenario.
Buying a subscription breaks down into three stages.
First: Display the paywall and products
The paywall is everything in a subscription app. It must load every single time, regardless of infrastructure chaos or internet problems.
Normally, the Adapty SDK makes an HTTP request to our server. All backend requests route through our CDN. If the paywall is cached there, that’s what displays. It’s much faster than waiting for a full server response. No CDN cache? The response comes from Adapty’s backend. Backend unavailable and no CDN cache? A locally cached paywall loads.
If all of that fails and there’s no local cache either (say, the user just installed the app), our “fallback backend” activates. We built it specifically as insurance, with 99.999999999% uptime, and it stores paywalls for all placements. If even that fails, a local fallback paywall loads, provided the developer configured one.
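To make the order of fallbacks concrete, here is a minimal Python sketch of the chain described above. It is an illustration only, not the Adapty SDK's actual code: the function `load_paywall`, the `PaywallUnavailable` error, and the source names are all hypothetical. It also models the CDN cache and the backend as separate steps, even though in practice a single HTTP request passes through both.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical type: a source returns paywall data, returns None on a miss,
# or raises an exception when it is unreachable.
PaywallSource = Callable[[str], Optional[dict]]


class PaywallUnavailable(Exception):
    """Raised when every source in the chain misses or fails."""


def load_paywall(
    placement_id: str,
    sources: Sequence[Tuple[str, PaywallSource]],
) -> Tuple[str, dict]:
    """Try each (name, source) pair in order and return the first hit."""
    for name, source in sources:
        try:
            paywall = source(placement_id)
        except Exception:
            # An unreachable source (timeout, 5xx, ...) is treated like a miss.
            continue
        if paywall is not None:
            return name, paywall
    raise PaywallUnavailable(placement_id)


# Stub sources for a fresh install during an outage (all hypothetical).
def unreachable(placement_id: str) -> Optional[dict]:
    raise ConnectionError("source unreachable")


def empty_cache(placement_id: str) -> Optional[dict]:
    return None


def fallback_backend(placement_id: str) -> Optional[dict]:
    return {"placement": placement_id, "products": ["monthly", "annual"]}


chain = [
    ("cdn_cache", unreachable),
    ("backend", unreachable),
    ("local_cache", empty_cache),        # fresh install: nothing cached yet
    ("fallback_backend", fallback_backend),
    ("local_fallback_paywall", empty_cache),
]

source_name, paywall = load_paywall("onboarding", chain)
print(source_name, paywall["products"])
```

The point of the design is that the caller never needs to know which layer answered. It either gets a paywall back or hits one explicit failure case when nothing at all is available.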
Second: Process the payment
This happens entirely on Apple’s or Google’s side. We can’t add safety nets here, but we trust these companies to handle payments reliably.
Third: Grant the subscription entitlements
After payment, the backend must process that transaction and assign the appropriate user entitlement. Otherwise you get a nightmare: payment succeeded, subscription didn’t activate.
The latest version of the Adapty iOS SDK addresses this by saving entitlements locally if our backend is unreachable. The user experience stays intact. We’re rolling this out across all our other SDKs.
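The idea of saving entitlements locally can be sketched as follows. This is a simplified Python illustration and not how the iOS SDK is implemented. The `EntitlementStore` class, the `BackendUnreachable` error, the JSON file layout, and the later re-sync step are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Callable


class BackendUnreachable(Exception):
    """Hypothetical error raised when the validation backend cannot be reached."""


# Hypothetical validator: (transaction_id, product_id) -> entitlement dict.
Validator = Callable[[str, str], dict]


class EntitlementStore:
    """Keeps entitlements on disk so a purchase still unlocks access offline."""

    def __init__(self, path: Path) -> None:
        self.path = path
        if path.exists():
            self.state = json.loads(path.read_text())
        else:
            self.state = {"entitlements": {}, "pending": {}}

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.state))

    def record_purchase(self, transaction_id: str, product_id: str,
                        validate: Validator) -> str:
        """Grant access after payment; fall back to a provisional local grant."""
        try:
            entitlement = validate(transaction_id, product_id)
            source = "backend"
        except BackendUnreachable:
            # Assumption: grant provisionally and remember the transaction
            # so it can be validated once the backend is reachable again.
            entitlement = {"product_id": product_id, "active": True,
                           "provisional": True}
            self.state["pending"][transaction_id] = product_id
            source = "local"
        self.state["entitlements"][product_id] = entitlement
        self._save()
        return source

    def sync_pending(self, validate: Validator) -> int:
        """Retry validation for provisional grants; return how many synced."""
        synced = 0
        for transaction_id, product_id in list(self.state["pending"].items()):
            try:
                entitlement = validate(transaction_id, product_id)
            except BackendUnreachable:
                continue  # still offline, keep the provisional grant
            self.state["entitlements"][product_id] = entitlement
            del self.state["pending"][transaction_id]
            synced += 1
        self._save()
        return synced
```

In this sketch the user gets access immediately in both cases. The only difference is whether the entitlement is marked provisional until the backend confirms the transaction.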
Continuous optimization
Adapty maintains 99.99% uptime with a dedicated 20-person team monitoring infrastructure 24/7/365. There’s always an engineer on duty. We use Grafana to ping our infrastructure from eight global checkpoints every 15 seconds, tracking response codes and latency. If a ping fails, an alert fires. If that alert isn’t resolved in 5 minutes, it escalates to DevOps. We catch problems immediately and fix them fast.
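The alert-then-escalate rule can be expressed as a small state machine. The Python sketch below is a simplified model and not a Grafana configuration. The names `Alert` and `handle_ping` are hypothetical, and it assumes two things the text does not state: that a ping "fails" when it times out or returns a non-2xx status, and that a later successful ping resolves the alert.

```python
from dataclasses import dataclass
from typing import Dict, Optional

PING_INTERVAL_S = 15        # from the text: one ping every 15 seconds
ESCALATE_AFTER_S = 5 * 60   # from the text: escalate after 5 minutes


@dataclass
class Alert:
    checkpoint: str
    fired_at: float
    escalated: bool = False


def handle_ping(checkpoint: str, status_code: Optional[int], now: float,
                alerts: Dict[str, Alert]) -> str:
    """Process one ping result and return the action taken.

    status_code is None when the ping timed out (assumption).
    """
    ok = status_code is not None and 200 <= status_code < 300
    alert = alerts.get(checkpoint)
    if ok:
        if alert is not None:
            del alerts[checkpoint]   # assumption: a good ping resolves the alert
            return "resolved"
        return "ok"
    if alert is None:
        alerts[checkpoint] = Alert(checkpoint, fired_at=now)
        return "alert"
    if not alert.escalated and now - alert.fired_at >= ESCALATE_AFTER_S:
        alert.escalated = True
        return "escalate"
    return "pending"


# Simulate one checkpoint that stays down for a little over five minutes.
alerts: Dict[str, Alert] = {}
timeline = []
for tick in range(0, 22):
    now = tick * PING_INTERVAL_S
    status = 200 if tick == 21 else 503
    timeline.append(handle_ping("frankfurt", status, now, alerts))
```

In this simulation the first failed ping opens an alert, the ping at the five-minute mark escalates it once, and the first healthy ping clears it.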
Beyond monitoring, we’re constantly optimizing. Recently, we moved A/B test processing through the CDN, dropping paywall load times in those tests to 300-500 milliseconds. Adapty holds SOC 2 Type 2 certification, which confirms we’ve built robust availability monitoring.
Computers are hard. Keeping them running is harder. But when 15,000 apps depend on your infrastructure to make money, you don’t have the luxury of downtime.




