Early in my career we used a process that loosely resembled a root cause analysis after a severity 1 production outage. The intent was to determine why the outage occurred and then fix the problem so it didn’t happen again. No one liked the process, and the documents we produced were rarely used to influence process improvement. It was a checkbox, a fill-in-the-blanks exercise to say we had completed it. I always thought the name post-mortem was a bit odd as well, and we were certainly dead to the process. Looking back, I see post-mortem efforts can be valuable if championed and executed correctly. But there is a better way.
Twenty years later, we are learning to build root cause analysis (RCA) into our recurring operational procedures. Like a post-mortem exercise, an RCA is typically done after an event has occurred, with the goal of preventing the problem from recurring. If done correctly, this can reduce waste and downtime.
But an RCA is distinct, with its own set of advantages. Our team is using lean A3 problem solving techniques as the backbone for RCAs. It is apparent to me that the RCA process, if supported and executed routinely, can shape a culture of continuous improvement. Here are a few practical ways:
- The outputs can be used as a proactive measure to predict and prevent future failures. Problem solving focuses on examining why events occur, coupled with action items and sustainment activities. This is a great way to identify potential future problems.
In one recent 5-why exercise about a database failure, we identified a few weaknesses in a process in addition to the root cause of the failure. Our corrective action plan addressed multiple weaknesses and has undoubtedly prevented some of them from becoming service outages.
- A systematic approach to RCA involves setting a recurring cadence for problem solving. RCAs require a wide range of knowledge to identify problems, compile documentation, and create sustainment activities. Individuals will struggle, but teams can thrive solving problems like this.
We post our RCAs on our department flow-and-performance board to make them visible, promote discussion, and keep the process top of mind. Our standard is to perform one RCA per month. This reinforces that RCAs are part of the team's culture.
- Done correctly, RCA focuses on resolving process deficiencies instead of blaming people. It’s not always easy, but we remind ourselves to focus on behaviors and results over individuals.
Onward and Upward!
Photo Credit: ResoluteSupportMedia via creative commons – https://flic.kr/p/88Kdgw