What’s the probability and impact of inaccurate values in your risk database? In other words, have you done a risk analysis on your risk analysis?
I know this sounds circular, and I am being a little facetious here, but this should be a real concern for you. What happens if your company decides a risk has a 10% probability when it really has a 90% probability? The whole purpose of modeling risk is to prepare yourself for the unknown, and you cannot be properly prepared if your data isn’t accurate. It’s like trying to drive your car across the country on one tank of gas. Your intentions are in the right place, but your assumptions will get you in trouble. The same thing can happen at your company, so we’ll spend some time today on how you can help your company get it right.
The Problem with Bad Probabilities
Last week in Beyond Compliance – Understanding Risk, we started with some architecture for modeling risk, and identified these properties of risk:
- Risk Name – Name the risk ( e.g. fraud )
- Probability – The probability that the risk will occur.
- Impact – What will happen if the risk occurs.
- Detectability – How easy it is to detect the risk.
You can of course expand on this list. However, it’s important to remember the last three properties ( probability, impact, and detectability ), as they serve a purpose beyond simply describing the risk. Among other places, these properties are used in Six Sigma when constructing an FMEA ( Failure Mode and Effects Analysis ). An FMEA is one of Six Sigma’s primary ways of characterizing and analyzing the different types of risk that could show up in a process or project.
To do the analysis, all three of these properties ( probability, impact, and detectability ) need to be quantified ( converted to a numerical representation ), and normalized ( set on the same scale ). That’s why in last week’s discussion, I suggested a numerical representation of all three numbers ( indicated by the _GUAGE suffix ), and further suggested that all three hold a range between 1 and 100.
These numbers need to be quantified and normalized because, to complete the FMEA, you will multiply all three together to arrive at a Risk Priority Number ( RPN ). The RPN then helps separate high risks from low risks.
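To make the multiplication concrete, here is a minimal sketch in Python. The function name and the sample scores are my own assumptions for illustration, not values from a real risk database. It shows how far the RPN moves when only the probability score changes from 10 to 90.

```python
def risk_priority_number(probability: int, impact: int, detectability: int) -> int:
    """Multiply the three normalized scores (each 1 to 100) into an RPN."""
    scores = {
        "probability": probability,
        "impact": impact,
        "detectability": detectability,
    }
    for name, score in scores.items():
        if not 1 <= score <= 100:
            raise ValueError(f"{name} must be between 1 and 100, got {score}")
    return probability * impact * detectability


# Hypothetical scores: impact and detectability are held at 50,
# and only the probability estimate differs.
rpn_with_bad_guess = risk_priority_number(probability=10, impact=50, detectability=50)
rpn_with_true_value = risk_priority_number(probability=90, impact=50, detectability=50)

print(rpn_with_bad_guess)   # 25000
print(rpn_with_true_value)  # 225000
```

Because the scores are multiplied, an error in any one input scales the whole RPN by the same factor. In this made-up case the risk would be ranked nine times lower than it should be.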
So with this in mind, can you imagine what would happen to your RPN if your inputs are wrong? Obviously, your RPN would be greatly skewed, and the whole analysis would be worthless. Unfortunately, this happens a lot in the real world. We’re going to make sure it doesn’t happen to you.
Out of the three properties, in my experience probability is the hardest to get accurate. Impact is largely subjective, so there’s no real “wrong” answer, and detectability is pretty easy to discern. Probability, however, must be calibrated because there is a “true” answer, and if you guess at it, you could be wrong.
The Financial Risk Data Mart Revisited
Previously, in How To Help Your Company with Financial Risk, I introduced the concept of a Financial Risk Data Mart. Today, we’re going to leverage this concept to get our risk probability as accurate as we can. You can’t do this with every risk, but when you can, it’s a very useful strategy.
To make this work, let’s profile a risk:
- Risk Name : Risk of Inaccurate Data Entry
- Risk Description : The risk that an inexperienced data processor will enter the wrong value
- Control Measure: Redundancy – We will have two different people enter the exact same data. Anytime there is a discrepancy between the two entries, we have an exception.
- Probability : ???
- Impact : Minor ( Score: 10 out of 100 ). Exceptions will be highlighted, and correct data values will be re-input.
- Detectability : Very High ( Score: 10 out of 100 ). With the control in place, this is very easy to detect ( note that when the detectability is high, the score is low ).
As you can see, with the control in place, this is a pretty low risk item. However, we’ll use it for illustrative purposes.
We will use our risk data mart to inform our probability metric here. You can use the same data mart with all your risks, by simply making risk a dimension.
The metrics that you will capture in your fact table will be observations per day ( # of data entry points ) and exceptions per day ( # of times the data points did not match up between the two data processors ). You can then find the exception percentage for that day by dividing the number of exceptions by the number of observations. For instance, if there were 1000 data entry points entered on a day, and 35 had conflicting entries between the two data processors, then the exception percentage would be 3.5 ( [ 35 / 1000 ] * 100 ). This exception percentage will inform the overall probability for your risk.
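The daily calculation can be sketched in a few lines of Python. The function name is hypothetical, and in practice the two counts would come from your fact table rather than being typed in by hand.

```python
def exception_percentage(exceptions: int, observations: int) -> float:
    """Return exceptions as a percentage of observations for one day."""
    if observations <= 0:
        raise ValueError("observations must be positive")
    if not 0 <= exceptions <= observations:
        raise ValueError("exceptions must be between 0 and observations")
    # Multiply first so the example from the text comes out as exactly 3.5.
    return 100 * exceptions / observations


# The example from the text: 35 conflicting entries out of 1000 data entry points.
print(exception_percentage(35, 1000))  # 3.5
```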
Using the Exception Percentage to Get Risk Probability
The mistake most people would make at this point is to translate 3.5% directly into the risk probability and call it a day. As you can tell by my language, this is the wrong approach. This is called a point estimate, and a single day’s point estimate is almost never correct.
Now, if you said you would collect 30 days of data and take an average, then I’d say you’re getting much closer to the answer. The only problem with this approach is that one day can really mess up your average. Let’s say you had 29 days where the exception percentage was between 3 and 5, and then one day for some strange reason you recorded a 95! Your average could be 6 or 7, which probably isn’t an accurate risk probability.
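Here is a quick sketch of that scenario using Python’s standard statistics module. The 30 daily values are invented for illustration: 29 ordinary days at 4 and one freak day at 95.

```python
from statistics import mean, median

# Hypothetical daily exception percentages: 29 ordinary days and one outlier.
daily_pct = [4.0] * 29 + [95.0]

print(round(mean(daily_pct), 2))  # 7.03
print(median(daily_pct))          # 4.0
```

The single bad day pulls the mean up to about 7, while the median stays at 4. That gap is a hint that the average alone is not telling the whole story.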
This is actually a trick question. The best answer will involve more than an average. To characterize anything like this, you need some measure of central tendency ( e.g. average ), some measure of spread ( e.g. standard deviation ), and some idea of the distribution’s shape ( e.g. normal, logarithmic, etc. ).
For central tendency, I would suggest taking the mean ( average ), median ( 50th percentile ), and mode ( most common value ) and comparing them. If they’re all about the same, you’re in good shape – go with the mean. For spread, as long as you’re looking at 30 days, I think standard deviation is okay unless your distribution shape is not normal. The best way to get your distribution shape is to just plot the values and look at the result. If it looks like a bell curve, this is a normal distribution, and once again you’re safe using the mean and standard deviation. If not, consult a statistician or a Six Sigma Black Belt.
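The comparison above can be sketched with Python’s standard statistics module. Everything named here is an assumption for illustration: the summarize function, the 30 sample values, and especially the tolerance of 0.5 percentage points that I use to decide whether the three measures are “about the same”. Pick a cutoff that makes sense for your own data.

```python
from statistics import mean, median, mode, stdev


def summarize(daily_pct, tolerance=0.5):
    """Summarize daily exception percentages.

    `tolerance` is a hypothetical cutoff, in percentage points, for treating
    the mean, median, and mode as "about the same".
    """
    center = {
        "mean": mean(daily_pct),
        "median": median(daily_pct),
        # Round before taking the mode so near-identical days group together.
        "mode": mode([round(p, 1) for p in daily_pct]),
    }
    gap = max(center.values()) - min(center.values())
    return {**center, "stdev": stdev(daily_pct), "agree": gap <= tolerance}


# Hypothetical 30 days of exception percentages, roughly bell shaped around 4.
daily_pct = [3.0] * 3 + [3.5] * 7 + [4.0] * 10 + [4.5] * 7 + [5.0] * 3

summary = summarize(daily_pct)
print(summary["mean"], summary["median"], summary["mode"])  # 4.0 4.0 4.0
print(summary["agree"])                                     # True
```

This only checks the numbers. It does not replace plotting the values and looking at the shape, which is still the step that tells you whether the mean and standard deviation are safe to use.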
A Lot of Work for One Value
I know it seems like a lot of work just to get one value, but it’s absolutely necessary. If your risk probabilities are inaccurate, that could invalidate all the rest of your assumptions. Trying to get probabilities correct can be tricky, but you can leverage the power of your risk data mart to get as close as possible. Take some time today to re-evaluate your highest risks.