Amazon Web Services Disaster: What Went Wrong — and Why It Could Happen Again

Millions affected as AWS outage exposes global cloud dependency

Millions of users and businesses were affected on 20 October 2025 when Amazon Web Services (AWS) suffered a widespread outage that disrupted major websites, banking apps and online services worldwide.

The failure, traced to a DNS configuration issue in the US-East-1 region in Virginia, prevented access to critical databases and led to service interruptions across gaming, social media, finance and retail platforms. AWS engineers said the issue was identified and mitigated within hours, restoring normal operations later in the day.

DNS Failure Cripples DynamoDB and Dependent Systems

According to AWS status reports, the outage began late on Sunday 19 October, US Pacific time, by which point it was already Monday 20 October across much of the world. Engineers detected increased error rates and network latency.

The fault was linked to Domain Name System (DNS) resolution problems affecting the endpoints of DynamoDB, a core database service that underpins thousands of cloud-based applications.

When the DNS fault occurred, connected systems could no longer locate their database resources, triggering cascading errors in AWS Lambda, Amazon EC2 and Amazon API Gateway operations.
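
To illustrate the failure mode described above, here is a minimal Python sketch of a service that depends on a database hostname. It is not AWS's code or any affected company's code. The names `locate_database` and `handle_request` are hypothetical, and the resolver is passed in as a parameter so the behaviour can be simulated without touching a real network.

```python
import socket
from typing import Callable, List

# A resolver maps a hostname to a list of IP addresses.
Resolver = Callable[[str], List[str]]


def system_resolver(hostname: str) -> List[str]:
    """Resolve a hostname with the operating system's DNS resolver."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def locate_database(hostname: str, resolver: Resolver = system_resolver) -> List[str]:
    """Hypothetical helper: return the database's addresses or raise ConnectionError."""
    try:
        addresses = resolver(hostname)
    except socket.gaierror as exc:
        # The database itself may be healthy, but the client cannot find it.
        raise ConnectionError(f"cannot resolve {hostname}: {exc}") from exc
    if not addresses:
        raise ConnectionError(f"{hostname} resolved to no addresses")
    return addresses


def handle_request(hostname: str, resolver: Resolver = system_resolver) -> str:
    """Hypothetical dependent service: it fails as soon as the lookup fails."""
    try:
        addresses = locate_database(hostname, resolver)
    except ConnectionError as exc:
        return f"503 Service Unavailable ({exc})"
    return f"200 OK (database at {addresses[0]})"
```

In this sketch every caller of `handle_request` sees an error once the lookup fails, even though nothing is wrong with the caller itself. That is the cascading pattern in miniature.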

PBS News reported that the outage rippled across multiple sectors, exposing how dependent much of the digital economy remains on AWS infrastructure.

Gaming platforms such as Fortnite and Roblox were hit alongside Snapchat, Signal, Coinbase and Robinhood, with users reporting login failures and transaction errors.

Smart-home systems including Alexa and Ring also experienced degraded performance. Some government websites in the UK temporarily went offline, underscoring the global scope of the incident.

Centralisation Risks Raise Alarm Among Experts

Technology analysts cited by WIRED said the incident revealed the fragility of the internet's core infrastructure and the risks of over-reliance on a single cloud provider. AWS's US-East-1 region, based in Virginia, hosts a disproportionate number of workloads for global companies because of its low latency and cost efficiency.

Experts noted that even with redundancy built into AWS, regional interdependencies mean that a single misconfiguration can have far-reaching consequences. The cloud computing outage reignited calls for greater diversification through multi-region and multi-cloud strategies.

'The lesson here is resilience,' Ookla industry analyst Luke Kehoe said, as reported by CNET. 'Many organizations still concentrate critical workloads in a single cloud region. Distributing critical apps and data across multiple regions and availability zones can materially reduce the blast radius of future incidents.'

Earlier US-East-1 outages in November 2020 and December 2021 had already prompted concerns about concentration risk within the cloud sector.

Business and Regulatory Implications

Following the disruption, several financial regulators in the United Kingdom and European Union urged organisations to reassess their third-party risk management and their service-level agreements with cloud providers.

While AWS stated that the issue was not caused by a cyberattack, officials said even benign misconfigurations could have severe consequences for essential services.

Industry experts recommend that businesses deploy critical workloads across multiple availability zones and providers. They also advise conducting disaster-recovery tests, implementing graceful-degradation protocols, and ensuring that DNS fallback mechanisms are properly configured.
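
As a rough illustration of what those recommendations could look like in application code, the Python sketch below tries a list of regional endpoints in order, skips any whose hostname does not resolve, and falls back to a cached copy marked as stale when no region can be reached. The function names, the `fetch` callback and the cache dictionary are assumptions made for this example, not part of any AWS SDK.

```python
import socket
from typing import Any, Callable, Dict, Sequence


def resolves(hostname: str) -> bool:
    """Return True if the OS resolver can find at least one address."""
    try:
        return bool(socket.getaddrinfo(hostname, 443))
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def read_with_fallback(
    key: str,
    endpoints: Sequence[str],                  # regional hostnames, in order of preference
    fetch: Callable[[str, str], Any],          # hypothetical: fetch(hostname, key) -> value
    cache: Dict[str, Any],                     # last known good values
    can_resolve: Callable[[str], bool] = resolves,
) -> Dict[str, Any]:
    """Try each region in turn; degrade to cached data if all of them fail."""
    for hostname in endpoints:
        if not can_resolve(hostname):
            continue  # DNS lookup failed for this region, try the next one
        try:
            value = fetch(hostname, key)
        except OSError:
            continue  # the region resolved but the request failed
        cache[key] = value
        return {"data": value, "stale": False}
    # Graceful degradation: serve the cached copy (or nothing) instead of crashing.
    return {"data": cache.get(key), "stale": True}
```

A caller can inspect the `stale` flag to decide whether to show a warning or disable writes. Whether serving stale data is acceptable depends on the application, so a payments service may reasonably choose differently from a news feed.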

The Growing Need for Cloud Resilience

Although AWS restored its services within hours, the outage has once again exposed how interconnected modern digital infrastructure has become.

Analysts warn that similar incidents could recur unless global organisations strengthen resilience planning and reduce their reliance on single-region configurations.