I dropped the production database at the first startup I worked at, three days after we went live. We were scrappy™ and didn’t have backups yet, so we lost all the data permanently. I learned that day that running automated tests on a production database isn’t a good idea!
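For what it's worth, a cheap guard would have saved me here. Below is a minimal sketch, assuming the test suite gets its connection string as a URL and that test databases live on a local host with a name ending in `_test`. The function name, the allowlist, and the naming rule are all hypothetical conventions of mine, not anything from a particular framework.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: hosts a test run is permitted to touch.
SAFE_TEST_HOSTS = {"localhost", "127.0.0.1", "::1"}


def assert_safe_test_target(dsn: str) -> None:
    """Raise RuntimeError unless the DSN points at a local, test-named database."""
    parsed = urlparse(dsn)
    host = parsed.hostname or ""
    db_name = parsed.path.lstrip("/")
    # Assumed convention: tests only ever run against a local host...
    if host not in SAFE_TEST_HOSTS:
        raise RuntimeError(f"refusing to run tests against host {host!r}")
    # ...and only against a database whose name ends in "_test".
    if not db_name.endswith("_test"):
        raise RuntimeError(f"refusing to run tests against database {db_name!r}")
```

Calling this once at the top of the test setup, before anything gets truncated or dropped, turns a misconfigured connection string into a loud failure instead of a data loss.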
Here is another one: Don't trust ops when they say they have backups. I asked and was told there were weekly full backups, with daily incrementals. The time came when I needed a production DB restored due to an upgrade bug in our application. That was bad, but thank $DEITY we have backups.
OPS: Huh, it appears we can't find your incremental.
ME: Well, just restore the weekly, it's only Tuesday.
Two days later.
OPS: About that backup. Turns out it's a backup of the servers, not the database. We'll have to restore to new VMs in order to get at the data.
ME: How did this happen?
OPS: Well, the backups work for MSSQL Server.
ME: This is PostgreSQL.
OPS: Yeah, apparently we started setting that up but never finished.
ME: You realize we have about 20 applications using that database?
OPS: Now we do.
Lesson: Until you personally have seen a successful restore from backup, you do not have backups. You have hopes and prayers that you have backups. I am forever in the Trust but Verify camp.
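If you want to see that restore with your own eyes, it can be scripted. This is only a rough sketch, assuming a PostgreSQL dump in a format `pg_restore` accepts (for example one made with `pg_dump -Fc`) and the standard `createdb`, `pg_restore`, `psql`, and `dropdb` client tools on the PATH. The function names, the scratch database, and the "count rows in one known table" check are my own illustrative choices; a real check would look at more than one table.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_cli(cmd: List[str]) -> str:
    """Run a command, raise on non-zero exit, and return its stdout."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def verify_restore(dump_path: str, scratch_db: str, sanity_table: str,
                   run: Runner = run_cli) -> int:
    """Restore a dump into a throwaway database and count rows in one table.

    sanity_table is interpolated into SQL, so it must be a trusted name.
    """
    # Start from a clean scratch database so stale data cannot mask a failure.
    run(["dropdb", "--if-exists", scratch_db])
    run(["createdb", scratch_db])
    try:
        run(["pg_restore", "--exit-on-error", "--no-owner",
             "--dbname", scratch_db, dump_path])
        out = run(["psql", "-X", "-A", "-t", "--dbname", scratch_db,
                   "-c", f"SELECT count(*) FROM {sanity_table}"])
        rows = int(out.strip())
        if rows == 0:
            raise RuntimeError(f"restore produced an empty {sanity_table} table")
        return rows
    finally:
        # Always clean up the scratch database, even when the restore failed.
        run(["dropdb", "--if-exists", scratch_db])
```

The `run` parameter exists only so the sequence can be exercised without a live server. In practice you would call something like `verify_restore("weekly.dump", "restore_check", "customers")` on a schedule, where the file, database, and table names are placeholders.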
If your company is big enough to have dedicated ops, then it should be running regular tests on backups. A disaster recovery process, if you will.
At some point, though, it's not your problem when the company is big enough. Are you gonna do everyone's job? You tell 'em what you need in writing, and if they drop the ball it's their head.
The majority of our apps were Java, running Tomcat on Windows Server using MSSQL or Oracle. That was tested as part of DR. Our Linux servers running Python and Postgres were not as high a priority, apparently.
The lack of working backups made it a problem because of the assurances and certifications we were required to maintain.
When starting a new project, I now request a dev database with a dump from prod more than 30 days ago, just to see the process work. Does it waste their time? Maybe. In which case it encourages more automation. Do I care? No. But I am not getting burned again.
It’s relative. No, I’m not sitting on the shoulder of the team that manages that (nor should I, there’d be 40 EMs bothering them!) but I fully expect my CTO has done it. And if not? Well, one day it’ll blow up and I’m looking for another job, but that’s no different to any other possible major issue.