We had a migration script that failed to run before the rest of our services deployed. This caused the newly deployed services to crash because the database was missing a column in a critical table. Our deployment orchestrator quickly detected the issue and reverted to the previous build. Total downtime was approximately 2 minutes.
We plan to mitigate future failures by requiring our migrations to complete before rolling out our remaining services.