> If the code in PROD was fine before the rollout, and there is a bug in the newly deployed code

Yes, *if*. My point is that that's a distinct subset of the possibilities, and you might not know whether it's true. Maybe you roll out a change at 10:00 and at 10:20 you start getting reports that certain key functionality isn't working. So you run your rollback... and now instead of 20% of users complaining, it's 100%, because that last update included a database migration and the old code can't cope with the new schema. Okay, maybe you're very good about your data model so that doesn't happen, but you roll back... and nothing changes. It turns out that the problem was caused by a third-party API you consume falling over. With the right guardrails, rolling back can often be an early option to improve the situation, but it's not a cure-all.
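
To make "the right guardrails" a bit more concrete, here's a rough sketch of the kind of pre-rollback check I have in mind. Everything in it is hypothetical: the `Incident` fields and the `rollback_advice` function are names I made up for illustration, and in a real setup the inputs would come from your own monitoring and release metadata.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical snapshot of what you know when the reports come in."""
    errors_started_after_deploy: bool  # did the errors only begin after the rollout?
    release_has_migration: bool        # did the release include a database migration?
    migration_is_reversible: bool      # can the old code safely run against the current schema?
    dependencies_healthy: bool         # are the third-party APIs you consume responding?


def rollback_advice(incident: Incident) -> str:
    """Return a coarse suggestion; a sketch of the reasoning, not a real runbook."""
    # If a dependency is down, rolling back your own code will not change anything.
    if not incident.dependencies_healthy:
        return "investigate dependency"
    # If the errors predate the deploy, the new code is probably not the cause.
    if not incident.errors_started_after_deploy:
        return "investigate further"
    # Old code against a migrated schema can turn a partial outage into a full one.
    if incident.release_has_migration and not incident.migration_is_reversible:
        return "fix forward"
    return "roll back"


if __name__ == "__main__":
    # The 10:00 deploy with an irreversible migration from the example above.
    print(rollback_advice(Incident(True, True, False, True)))  # fix forward
```

The order of the checks is the point: rollback is only suggested once the cases where it would do nothing, or make things worse, have been ruled out.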