It’s beyond time we reexamine the post-incident question of luck, and I submit: asking “how we were lucky” in an incident retrospective is not only a waste of your and your colleagues’ time, it is distractingly detrimental to post-incident analysis.
An obvious reason why “luck” discussions are not useful: what are you supposed to do with aspects of the incident attributed to a good outcome of a roll of the dice? We don’t follow up a “lucky” attribution by asking “What will you do next time to be lucky? How can we all get luckier in the future? Can you teach me how to flick my wrists just right when I let go of the dice?” Director James Cameron is oft-quoted as saying “Hope is not a strategy; luck is not a factor,” and spending time talking about something that, by definition, we have no control over is placing a pretty big bet on hope. For all of the discussions of “luck” I’ve heard in incident retrospectives, I’ve never seen anyone put “Be more lucky again next time” on the list of action items that goes to the boss’s boss.
A second, and more important, reason why an attribution of “luck” is unhelpful in incident analyses: its use often masks other aspects of the system that we can influence, and when we chalk it up to “luck” and move on, we miss a big opportunity. In a recent retrospective, someone said, “We were really lucky Sam happened to be on the call for this incident; he’s worked here a really long time, and he noticed that error code didn’t look quite right, which led us all to investigate and find the triggering issue. Nobody else would’ve found that as quickly.”
Source: What’s Luck Got To Do With It?, J. Paul Reed, https://jpaulreed.com/thoughts/whats-luck-got-to-do-with-it.html