
I received a question after my last blog post asking me to clarify the concepts of within-subgroup and between-subgroup variation, which are used in calculating Cpk, Cp, Cr, Ppk, Pp, Pr, and other statistics. Here is an example I used to help explain the differences.
Let’s say that every day I run about 30 minutes with my chocolate labrador, Cadbury (pictured below).

While running, I decide to measure how fast we are going. I measure the speed (pace) three times throughout the run: toward the beginning, the middle, and the end of the run. This data tells me a few things:
1. The pace at the beginning, middle, and end of the run.
2. The average pace we keep. This average pace is also called an X-bar.
3. The difference between the fastest pace and the slowest pace, also called the range.
4. Cadbury, like me, has a lot more energy at the beginning of our run than at the end.
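To make items 2 and 3 concrete, here is a minimal sketch of the arithmetic for a single run. The three pace values are made up for illustration (in minutes per mile); they are not real measurements from our runs.

```python
from statistics import mean

# Hypothetical paces (minutes per mile) from one run: beginning, middle, end.
paces = [9.0, 9.5, 10.0]

x_bar = mean(paces)                   # average pace for the run (X-bar)
pace_range = max(paces) - min(paces)  # largest reading minus smallest reading

print(f"X-bar = {x_bar:.2f} min/mile, range = {pace_range:.2f} min/mile")
```

The three readings from one run form one subgroup, so each run yields one X-bar and one range.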
The data collected is easy enough to understand. But what if I want to know if the process is capable (of meeting some goal)? This is the same question you no doubt have had about data collected for traditional quality improvement purposes. The tool I need to answer this question is capability analysis.
I am using the example of running to illustrate variability. As mentioned in #2 above, the average pace of a run is the X-bar. The range of the three measurements (#3 above) is a measure of the within subgroup variation. When the ranges from many subgroups are averaged, they give an estimate of what is sometimes referred to as the subgroup (or sample) standard deviation.
Let’s say Cadbury and I have run daily for 20 days. I took 3 measurements of my pace each day, so I now have 60 measurements. The variation of all 60 measurements is called the variation within and between the subgroups. This is sometimes called the total standard deviation.
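The two kinds of variation can be sketched in a few lines. This is only an illustration under stated assumptions: the paces are made up, only four days are shown instead of 20 to keep it short, and the within subgroup standard deviation is estimated as the average range divided by the standard control-chart constant d2 (1.693 for subgroups of three). Other estimators exist, and your software may use a different one.

```python
from statistics import mean, stdev

D2_N3 = 1.693  # standard control-chart constant d2 for subgroups of size 3

# Hypothetical paces (minutes per mile): one row per day, three readings each.
days = [
    [9.0, 9.5, 10.0],
    [9.2, 9.6, 10.1],
    [8.8, 9.4, 9.9],
    [9.1, 9.7, 10.3],
]

# Within subgroup variation: based only on the spread inside each day.
ranges = [max(d) - min(d) for d in days]
r_bar = mean(ranges)
sigma_within = r_bar / D2_N3

# Within and between subgroup variation: all readings treated as one sample.
all_paces = [p for d in days for p in d]
sigma_total = stdev(all_paces)

print(f"R-bar = {r_bar:.3f}")
print(f"within sigma = {sigma_within:.3f}, total sigma = {sigma_total:.3f}")
```

The within estimate never compares one day with another, while the total estimate picks up day-to-day differences as well.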
Within subgroup variation (subgroup standard deviation) is used in calculating control limits and Cp, Cr, and Cpk. Within and between subgroup variation (total standard deviation) is used in calculating Pp, Pr, and Ppk. Keep the questions coming–Cadbury and I will try our best to answer them!
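For readers who want to see where each standard deviation goes, here is a rough sketch using the textbook formulas for these indices. The specification limits, mean, and sigma values are hypothetical numbers picked for illustration, and the `capability` helper is my own naming, not a function from any particular package.

```python
def capability(mean, sigma, lsl, usl):
    """Return (potential, ratio, actual) capability for a given sigma."""
    potential = (usl - lsl) / (6 * sigma)               # Cp or Pp
    ratio = 1 / potential                               # Cr or Pr
    actual = min(usl - mean, mean - lsl) / (3 * sigma)  # Cpk or Ppk
    return potential, ratio, actual

# Hypothetical values, chosen only for illustration (minutes per mile).
LSL, USL = 8.0, 11.0
grand_mean = 9.5
sigma_within = 0.5  # subgroup standard deviation
sigma_total = 0.6   # total standard deviation

cp, cr, cpk = capability(grand_mean, sigma_within, LSL, USL)
pp, pr, ppk = capability(grand_mean, sigma_total, LSL, USL)

print(f"Cp={cp:.2f} Cr={cr:.2f} Cpk={cpk:.2f}")
print(f"Pp={pp:.2f} Pr={pr:.2f} Ppk={ppk:.2f}")
```

The formulas are identical for both families; the only difference is which standard deviation is plugged in.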
February 2, 2012 at 2:39 pm
When are total standard deviation and subgroup standard deviation the same?
February 3, 2012 at 8:25 am
Subgroup standard deviation is typically the variability within a subgroup of n measurement values, where n is your sample size. For example, if your sample size is 5, it describes how much variation you have among those 5 values.
Total standard deviation is the variability within your subgroups AND between your subgroups. This is often referred to as the standard deviation of the individual values.
When are they the same? When the variability within your subgroups is the same as the combined variability within and between your subgroups. In short, if your process does not have big swings from start to finish, your total variability will likely be similar to the variability you see within a sample.
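A small simulation can illustrate this answer. It is a sketch under assumptions, not a model of any real process: readings are drawn from a normal distribution, 200 simulated days are used so the estimates are not too noisy, the subgroup size is 3, and the within sigma is estimated as the average range divided by d2 = 1.693. The `sigma_ratio` helper and its parameters are hypothetical names.

```python
import random
from statistics import mean, stdev

D2_N3 = 1.693  # standard control-chart constant d2 for subgroups of size 3

def sigma_ratio(day_shift_sd, n_days=200, within_sd=0.2, seed=1):
    """Return total sigma divided by within sigma for simulated subgroups."""
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        # Each day gets its own mean; day_shift_sd controls the size of swings.
        day_mean = 9.5 + rng.gauss(0.0, day_shift_sd)
        days.append([rng.gauss(day_mean, within_sd) for _ in range(3)])
    r_bar = mean(max(d) - min(d) for d in days)
    sigma_within = r_bar / D2_N3
    sigma_total = stdev([x for d in days for x in d])
    return sigma_total / sigma_within

stable_ratio = sigma_ratio(day_shift_sd=0.0)    # no swings between days
drifting_ratio = sigma_ratio(day_shift_sd=1.0)  # large swings between days

print(f"stable: {stable_ratio:.2f}, drifting: {drifting_ratio:.2f}")
```

With no day-to-day swings the ratio should come out close to 1, meaning the two standard deviations roughly agree; with large swings the total standard deviation becomes several times larger than the subgroup standard deviation.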