What happens when we all put our heads in the cloud

Yesterday’s AWS outage reinforces an old truth: the cloud is—oversimplified but not wrong—someone else’s computer.

This isn’t an argument against cloud adoption. It’s about trust distribution and control boundaries. How much faith do you place in a single provider? How much can you afford to take advantage of their redundancies, or can you afford not to? What’s beyond your control regardless of SLAs? What do those SLAs actually mean? Are you hedging those bets?

The pattern looks familiar: numerous organizations, despite apparent diversity in their offerings, seem to have concentrated their infrastructure in AWS US-East-1. No real surprise. What’s telling is (based on reporting to date) the apparent lack of tested multi-region failover capabilities. When this reportedly DNS-originated incident hit, rapid adaptation failed—if it even existed.

I don’t operate at hyperscale. I am a web developer who evolved through digital professional services into a technology manager. My experience comes from managing other technologists, engineers, and vendors for what was, in terms of staffing, a larger medium-sized business, one where our budget probably couldn’t match what these tech giants spend on toilet paper. Yet we maintained business continuity for a 24/7 crisis line through outages. When any cloud component failed, whether SaaS platforms or our AWS deployments, we retained operational capability through alternative pathways, including fallback to “pre-cloud” configurations.

Our teams’ recovery strategy addressed both recovery time objective (RTO: how long we could afford to be down) and recovery point objective (RPO: how much recent data we could afford to lose) intuitively, before we knew what those terms were. We identified single points of failure, built redundancy where cost-effective, and accepted calculated risks elsewhere. Most importantly, we tested these failover procedures regularly.
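
For readers who prefer to see the idea in code, here is a minimal sketch of how a failover drill could be scored against RTO and RPO targets. It is not the tooling we used; the `DrillResult` fields, the `evaluate_drill` function, and the example timestamps and targets are all hypothetical, chosen only to illustrate the two measurements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillResult:
    """Timestamps recorded during one failover drill (hypothetical fields)."""
    outage_start: datetime      # when the primary system was taken offline
    service_restored: datetime  # when the fallback began serving users
    last_good_backup: datetime  # newest data available on the fallback


def evaluate_drill(result: DrillResult, rto: timedelta, rpo: timedelta) -> dict:
    """Compare measured recovery time and data-loss window against targets."""
    recovery_time = result.service_restored - result.outage_start
    data_loss_window = result.outage_start - result.last_good_backup
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Example drill with made-up numbers: 20 minutes to recover,
# and the newest backup was 15 minutes old when the outage began.
drill = DrillResult(
    outage_start=datetime(2025, 1, 1, 12, 0),
    service_restored=datetime(2025, 1, 1, 12, 20),
    last_good_backup=datetime(2025, 1, 1, 11, 45),
)
report = evaluate_drill(drill, rto=timedelta(minutes=30), rpo=timedelta(minutes=10))
```

In this made-up drill the recovery time fits inside the 30-minute RTO, but the 15-minute-old backup misses the 10-minute RPO. That is the kind of gap that only shows up when you actually run the test.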

This sounds like common sense. Experience suggests otherwise.

If your organization needs someone who treats information systems and infrastructure as a product serving your team, someone who speaks plainly and pragmatically about availability and threat models and who builds contingency plans that actually meet reality, especially if you’re trying to make the world a better place, I am interested.