The Backup That Wasn't

DEV Community·Vivian Voss·about 1 month ago

Tales from the Bare Metal, Episode 01

« Thou shalt not trust a backup thou hast not restored! »

At half past eleven on the night of Tuesday, 31 January 2017, an engineer at GitLab.com typed rm -rf on what they believed was the secondary PostgreSQL database. The terminals on their screen were visually identical, save for the hostname in the prompt. Two seconds later, when they realised the prompt did not say what they thought it said, they killed the command. By that point, three hundred gigabytes of production data had been removed.

That was the easy part. The hard part came over the next eighteen hours, as the team discovered, in a sequence that has since become teaching material, that none of their five backup mechanisms had been working.

This is a long-documented incident. GitLab's response, by industry standards, was extraordinary. They live-streamed the recovery on YouTube. They published their internal chat logs.…
