Amazon Calls Emergency Engineering Meeting After AI-Generated Code Causes 'High Blast Radius' Outages
Amazon's top retail tech executive convened an urgent 'deep dive' after four severe outages in one week were linked to generative AI-assisted code changes.
Amazon's top retail technology executive has convened an urgent "deep dive" meeting after a string of severe outages affecting the company's website and shopping app, incidents the company has linked in part to generative AI-assisted code changes. The meeting, first reported by the Financial Times and confirmed by CNBC, is one of the clearest real-world signals yet that AI coding tools are creating new operational risks at massive scale.
Four Sev-1 Incidents in One Week
Dave Treadwell, Amazon's Senior Vice President of eCommerce Foundation, told employees in an internal memo that the company's website availability "has not been good recently." He revealed that Amazon experienced four Sev-1 incidents — the highest severity classification, indicating outages or degraded performance of critical systems — in a single week.
"Folks — as you likely know, the availability of the site and related infrastructure has not been good recently." — Dave Treadwell, SVP of eCommerce Foundation, in a memo to Amazon employees
The most visible incident occurred last Thursday, when Amazon's online store malfunctioned for roughly six hours. Website and app users were unable to check out, access account information, or view product prices. Amazon attributed the issues to a "software code deployment."
GenAI Tools: The Smoking Gun
In a separate memo viewed by CNBC, Treadwell identified "genAI-assisted changes" as one of the contributing factors to the recent incidents, which he said date back to the third quarter of 2025. He specifically pointed to "GenAI tools supplementing or accelerating production change instructions, leading to unsafe practices."
The admission is striking. Amazon has been aggressively pushing AI adoption internally: its own Kiro AI coding tool launched last July, and the company has been measuring employee performance partly on how much employees use generative AI. The push follows the layoff of approximately 30,000 corporate workers over the past six months, with AI capabilities cited as one of the justifications.
New Guardrails Going Up
Amazon is now implementing what Treadwell called "temporary safety practices" to prevent further issues:
- Senior engineer review required for all GenAI-assisted production changes made by lower-level staff
- "Controlled friction" added to changes in the most critical parts of the retail experience
- Investment in "deterministic and agentic safeguards" as longer-term solutions
"We are implementing temporary safety practices which will introduce controlled friction to changes in the most important parts of the Retail experience, in parallel we will invest in more durable solutions including both deterministic and agentic safeguards." — Dave Treadwell
The company also made its usually optional weekly engineering meeting, called "This Week in Stores Tech" (TWiST), mandatory for all engineers and shifted its focus entirely to the outage pattern.
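Amazon has not published how these rules are enforced. As a rough illustration only, the first two practices in the list above could be expressed as a pre-deployment check along the following lines. Every name here is hypothetical: the `Change` fields, the `SENIOR_LEVEL` threshold, and the `CRITICAL_PATHS` prefixes are assumptions made for the sketch, not details from Treadwell's memo.

```python
from dataclasses import dataclass

# Hypothetical values; none of these come from Amazon.
SENIOR_LEVEL = 6                                          # assumed cutoff for "senior engineer"
CRITICAL_PATHS = ("checkout/", "pricing/", "account/")    # assumed critical retail areas


@dataclass
class Change:
    author_level: int          # job level of the engineer making the change
    genai_assisted: bool       # whether a GenAI tool helped produce the change
    paths: list[str]           # files touched by the change
    senior_approvals: int = 0  # approvals already given by senior engineers


def required_gates(change: Change) -> list[str]:
    """Return the extra gates a change must still pass before deployment."""
    gates = []
    # Senior review for GenAI-assisted changes made by lower-level staff.
    if (change.genai_assisted
            and change.author_level < SENIOR_LEVEL
            and change.senior_approvals < 1):
        gates.append("senior_review")
    # "Controlled friction" for changes that touch critical areas.
    if any(path.startswith(CRITICAL_PATHS) for path in change.paths):
        gates.append("controlled_friction")
    return gates


change = Change(author_level=4, genai_assisted=True, paths=["checkout/cart.py"])
print(required_gates(change))  # ['senior_review', 'controlled_friction']
```

A rule-based check of this kind is deterministic: the same change always produces the same gates. The "agentic" safeguards Treadwell mentioned are not described in the memo excerpts and are not modeled here.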
The AWS Connection
Amazon Web Services has also faced its own outage issues recently, though the company said Tuesday that the cloud group was not involved in the specific incidents referenced by Treadwell. However, the FT reported that an AWS incident in December — which took down a cost management feature for an extended period — occurred after engineers allowed Amazon's Kiro AI coding tool to make changes. Amazon said at the time that the outage was the result of "user error" and not AI.
A Cautionary Tale for the Industry
The irony is hard to miss. Amazon is simultaneously spending $200 billion on AI infrastructure this year, more than any of its tech peers, while dealing with the fallout of AI tools breaking its own systems. The company laid off 14,000 workers in October and 16,000 in January, often citing AI-driven efficiency, and now needs to add human oversight back into AI-assisted workflows.
For the broader tech industry, Amazon's experience is a cautionary tale. Generative AI coding tools can accelerate development, but they can also accelerate bad deployments, obscure accountability, and multiply the impact of mistakes. Treadwell acknowledged in his memo that "best practices and safeguards" around generative AI usage haven't been fully established yet — a remarkable admission from a company that has been among the most aggressive AI adopters in the world.
The question now isn't whether AI should assist in software development — that ship has sailed. It's whether companies can build governance frameworks fast enough to keep pace with adoption. Amazon's answer, for now, is to slow down and add humans back into the loop.