5 Measuring Variance

Objectives

In this section, we will discuss:

The three numbers that you will need to calculate Days in AR.
What those three numbers mean and where to find them.
The steps in the calculation of Days in AR.

5.1 Standard Deviation

The standard deviation is the single most informative measure of variation: it measures how far the values in a set spread out from the mean. The bigger the standard deviation, the greater the variation relative to the mean. The standard deviation can be determined as follows:

1. Calculate the mean of the set of values.
2. Subtract the mean from each individual value in the set, resulting in a list of differences between the individual values and the mean.
3. Square each of the values calculated in Step 2.
4. Sum the values calculated in Step 3.
5. Divide the value calculated in Step 4 by the total number of values.
6. Calculate the square root of the result from Step 5.
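
As a rough sketch, the six steps above can be written out in Python. The function name `population_std_dev` and the sample values are illustrative assumptions, not part of the original analysis. Because Step 5 divides by the total number of values, this is the population standard deviation; a sample standard deviation would divide by one less than the number of values.

```python
import math

def population_std_dev(values):
    """Standard deviation computed step by step, dividing by N (population form)."""
    n = len(values)
    mean = sum(values) / n                      # Step 1: mean of the values
    deviations = [x - mean for x in values]     # Step 2: differences from the mean
    squared = [d ** 2 for d in deviations]      # Step 3: square each difference
    total = sum(squared)                        # Step 4: sum of the squares
    variance = total / n                        # Step 5: divide by the number of values
    return math.sqrt(variance)                  # Step 6: square root

# Illustrative values only (not taken from the dataset in this chapter).
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std_dev(sample))  # 2.0
```

The sign of the differences in Step 2 does not affect the result, since every difference is squared in Step 3.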

5.2 Multivariate Analysis Visualization

5.3 Tukey’s Control Chart

In the absence of sufficient historical data, it would seem impossible to determine anything from, say, 12 data points. What legitimate insight could you offer from an analysis of so little information? Let me introduce you to Tukey’s Control Chart. In essence, statistician and mathematician John Tukey came up with the next best thing: summarizing your sample by its median and fourths to overcome insufficient observations per period. It’s not perfect, but it is an excellent way to measure variance in a small dataset of time-based observations.

The procedure for calculating control limits starts with the difference between the Upper Fourth and the Lower Fourth of the data, a quantity Tukey named the Fourth Spread.

The median is the mid-point of a set of data: half the data are below it and half the data are above it.

The Lower Fourth is similar to the first quartile (25th percentile). It is the median of the lower half of the data, so about 25% of the data are below this value.

The Upper Fourth is similar to the third quartile (75th percentile). It is the median of the upper half of the data, so about 75% of the data are below this value.

The difference between the two Fourths is referred to as the Fourth Spread.

The Upper Control Limit (UCL) is calculated as the sum of the Upper Fourth and 1.5 times the Fourth Spread.

The Lower Control Limit (LCL) is calculated as the difference between the Lower Fourth and 1.5 times the Fourth Spread.

5.3.1 Calculation of Tukey’s Control Limits

I’ll use the GCt column of data from this dataset, calculate the control limits, then plot the LCL and UCL on a graph and analyze the results:

Calculate the median. If the number of observations is odd, use the value in the middle. If there is an even number of observations, take the average of the two middle-ranked numbers.

## [1] 155483

Divide the data set into two halves using the median. Include the median in both halves if the median is one of the observed data points.

## [1]  86047.0 123654.0 131440.3 146878.1 151410.7 153991.0

## [1] 156975.0 163799.4 169094.5 198655.1 297731.7 325982.0

The Lower Fourth data point is the median of the lowest 50% of the data - data from the smallest number up to (or including) the median.

## [1] 139159.2

The Upper Fourth is the median of the top 50% of the data - numbers ranging from (or including) the median of the full data set to the highest value.

## [1] 183874.8

Calculate the Fourth Spread as the difference between the two Fourths.

## [1] 44715.6

Calculate the UCL and LCL using the following two formulas:
- LCL = Lower Fourth - (1.5 x Fourth Spread)
- UCL = Upper Fourth + (1.5 x Fourth Spread)

## [1] 72085.8

## [1] 250948.2
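
The whole procedure can be sketched in a few lines of Python. This is a minimal illustration, not the code that produced the outputs above: the function name `tukey_control_limits` and the variable `gct` are assumptions, and the twelve values are copied from the two halves printed earlier.

```python
from statistics import median

def tukey_control_limits(values):
    """Return the median, fourths, Fourth Spread, LCL and UCL for a list of numbers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    # Odd count: the median is an observed point, so it goes into both halves.
    # Even count: the data split cleanly down the middle.
    lower_half = data[: half + (n % 2)]
    upper_half = data[half:]
    lower_fourth = median(lower_half)
    upper_fourth = median(upper_half)
    spread = upper_fourth - lower_fourth
    return {
        "median": median(data),
        "lower_fourth": lower_fourth,
        "upper_fourth": upper_fourth,
        "fourth_spread": spread,
        "lcl": lower_fourth - 1.5 * spread,
        "ucl": upper_fourth + 1.5 * spread,
    }

# The twelve values shown in the two halves above (hypothetical variable name).
gct = [86047.0, 123654.0, 131440.3, 146878.1, 151410.7, 153991.0,
       156975.0, 163799.4, 169094.5, 198655.1, 297731.7, 325982.0]

for name, value in tukey_control_limits(gct).items():
    print(f"{name}: {value:.1f}")
# median: 155483.0
# lower_fourth: 139159.2
# upper_fourth: 183874.8
# fourth_spread: 44715.6
# lcl: 72085.8
# ucl: 250948.2
```

For these twelve values the sketch reproduces the median, Fourths, Fourth Spread, LCL and UCL shown in the steps above.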

The chart shows that for total Gross Charges, January and February are the only months not within the control limits. For Ending AR Balance, February is the only month out of bounds, ending up just barely beyond the UCL.
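
The same check can be done without a chart by comparing each value to the limits. The sketch below is only an illustration: `out_of_control` is a hypothetical helper, the limits are the LCL and UCL calculated above, and the Gross Charges values are listed in sorted order because the month-by-month order is not reproduced in this section. The positions it reports are therefore positions in the sorted list, not month numbers.

```python
LCL = 72085.8    # Lower Control Limit calculated above
UCL = 250948.2   # Upper Control Limit calculated above

def out_of_control(values, lcl, ucl):
    """Return (position, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]

# Gross Charges values in sorted order (month labels are not available here).
gct_sorted = [86047.0, 123654.0, 131440.3, 146878.1, 151410.7, 153991.0,
              156975.0, 163799.4, 169094.5, 198655.1, 297731.7, 325982.0]

print(out_of_control(gct_sorted, LCL, UCL))
# [(10, 297731.7), (11, 325982.0)]
```

Two values exceed the UCL and none fall below the LCL, which is consistent with two months of Gross Charges lying outside the control limits.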

In interpreting a control chart, we rely on the source of the control limits to define the reference point. If control limits were derived from historical patterns, we would compare the data to historical patterns. If control limits were calculated from expected patterns (e.g., risk-adjusted patterns), the comparison group is the expected pattern.

The reference here is not historical patterns, but total Gross Charges and Ending AR balances that will result in the targeted Days in AR benchmark.
