It has been estimated that as many as 50% of products and services have demand patterns with “lots of zeroes,” which creates special challenges for demand estimation. Failing to handle those zeroes correctly can cripple an operational process in industries ranging from semiconductors to cell phones.
However, this is not an insurmountable problem. By understanding the basics of intermittent demand, and following a few guidelines, it’s possible to manage the issue. First, though, it’s important to start with a couple of basic assumptions:
- Standard metrics for forecast accuracy are not just misleading for this kind of demand; relying on them can lead to decisions that harm the business.
- The key metric to consider is business impact. Organizations should create a risk profile that looks at the probability of demand occurring across time or possible lead time.
It is very helpful to divide products with “lots of zeroes” into two groups:
- Structural zeroes: The zeroes in this group have a pattern that relates to the structure of the supply chain or data collection methods.
- Intermittent / sparse / lumpy zeroes: In this scenario, demand has lots of zeroes spread randomly across time.
The examples in this blog will assume four years of demand history where the time bucket is months. That represents 48 total observations. Tables 1, 2, and 3 provide examples of structural zeros.
- Table 1: The zeroes are at the start of the history – indicating the product was not active at this time or the demand data was not collected. Often a “zero” as opposed to a null is used as filler.
- Table 2: Every other cell is zero. This often occurs if the demand collection system only captures demand every other month.
- Table 3: There are zeroes in a block of months (April – August). This indicates a structural factor that drives demand to zero during this period. For example, demand for flu shots is seasonal.
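The three structural patterns above can be screened for with a few simple checks. The following is a minimal Python sketch, not a method from the original tables: the function names `leading_zeros` and `always_zero_positions` are illustrative, and the sketch assumes the history is a plain list of monthly values that starts in January.

```python
def leading_zeros(series):
    """Count the zeroes at the start of the history (Table 1 pattern)."""
    count = 0
    for value in series:
        if value != 0:
            break
        count += 1
    return count


def always_zero_positions(series, cycle=12):
    """Return positions within a repeating cycle that are zero every time.

    cycle=12 looks for calendar months that are always zero (Table 3
    pattern); cycle=2 looks for an every-other-month gap (Table 2 pattern).
    Position 0 is the first period of the history (assumed to be January).
    """
    positions = []
    for pos in range(cycle):
        values = series[pos::cycle]  # every value at this cycle position
        if values and all(v == 0 for v in values):
            positions.append(pos)
    return positions


# Hypothetical four-year history with no demand from April to August.
one_year = [5, 4, 6, 0, 0, 0, 0, 0, 3, 4, 5, 6]
history = one_year * 4
print(always_zero_positions(history))           # positions 3..7 = April..August
print(always_zero_positions([3, 0, 2, 0, 5, 0], cycle=2))
print(leading_zeros([0, 0, 0, 2, 1]))
```

A series that passes none of these checks is a candidate for the intermittent group, though other structural causes may exist that these checks do not cover.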
Intermittent (other terms used are sparse and lumpy) refers to demand patterns where there are many zeroes (typically at least 50%), the dispersion or location of the zeroes does not show a particular pattern (i.e. random), and the non-zero values have a range of values without an apparent pattern.
When a statistician uses the term “random,” it indicates that assuming randomness is the best we can do given the information available and any discernable pattern that can be found in the current data. It does not mean there is no cause for a zero or non-zero. Rather, it is simply the best we can do right now and it is optimal to deploy methods that provide insight with this assumption.
How do we know if the assumption of randomness is reasonable for a given data set? The non-parametric statistical method called a runs test is a powerful tool (see Nonparametric Statistical Inference by Gibbons and Chakraborti). A run is defined as a succession of consecutive zeroes or consecutive non-zero values in the data set. For example, in the data set 0 0 0 0 0 1 1 1 1 1, there are 10 members and two runs. In the data set 0 1 0 1 0 1 0 1 0 1, there are 10 members and 10 runs. In the data set 1 1 1 0 0 0 1 0 0 1, there are 10 members and 5 runs (1 1 1, then 0 0 0, then 1, then 0 0, then 1). When the number of runs is too small or too large, we conclude the data set is not random.
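Counting runs is easy to automate. The following is a minimal Python sketch; the function names `count_runs` and `runs_test_z` are illustrative, not from the referenced text. The z-score uses the standard large-sample normal approximation for the runs test, which is only rough for very short series such as the 10-member examples above; exact tables are preferable there.

```python
import math


def count_runs(series):
    """Count runs of consecutive zero or consecutive non-zero values."""
    flags = [value != 0 for value in series]
    if not flags:
        return 0
    # A new run starts wherever the zero / non-zero flag changes.
    changes = sum(1 for prev, cur in zip(flags, flags[1:]) if prev != cur)
    return 1 + changes


def runs_test_z(series):
    """z-score of the observed run count under the randomness assumption."""
    flags = [value != 0 for value in series]
    n1 = sum(flags)              # number of non-zero periods
    n0 = len(flags) - n1         # number of zero periods
    if n0 == 0 or n1 == 0:
        raise ValueError("series needs both zero and non-zero values")
    n = n0 + n1
    expected = 2 * n0 * n1 / n + 1
    variance = 2 * n0 * n1 * (2 * n0 * n1 - n) / (n ** 2 * (n - 1))
    return (count_runs(series) - expected) / math.sqrt(variance)


print(count_runs([0, 0, 0, 0, 0, 1, 1, 1, 1, 1]))   # 2
print(count_runs([0, 1, 0, 1, 0, 1, 0, 1, 0, 1]))   # 10
print(round(runs_test_z([0, 0, 0, 0, 0, 1, 1, 1, 1, 1]), 2))
```

A strongly negative z-score means too few runs (the zeroes are clustered), and a strongly positive one means too many (the zeroes alternate too regularly). As a rule of thumb, an absolute value above roughly 2 casts doubt on the randomness assumption.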
For our example, we will assume the probability of getting a nonzero demand value is 20% and that, if there is demand, the possible values are 1, 2, or 3 (with equal probability, for an average of 2). With 48 total observations, the expected number of nonzero cells is 9.6 (= 0.2 * 48) and the expected demand per cell is 0.4 (= (0.2 * 48 * 2) / 48 = 0.2 * 2).
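The assumptions above are enough to generate a synthetic intermittent series. The following is a minimal Python sketch; the function name `simulate_demand` and the seed are illustrative choices, and the output is a freshly simulated series, not a reproduction of any table in this post.

```python
import random


def simulate_demand(periods=48, p_nonzero=0.2, values=(1, 2, 3), seed=0):
    """Simulate intermittent demand: zero with probability 1 - p_nonzero,
    otherwise one of `values` chosen with equal probability."""
    rng = random.Random(seed)  # seeded so the series is reproducible
    return [
        rng.choice(values) if rng.random() < p_nonzero else 0
        for _ in range(periods)
    ]


periods, p_nonzero, values = 48, 0.2, (1, 2, 3)
expected_nonzero_cells = p_nonzero * periods                # about 9.6
expected_demand_per_cell = p_nonzero * sum(values) / len(values)  # about 0.4

demand = simulate_demand(periods, p_nonzero, values)
print(demand)
print(sum(1 for d in demand if d != 0), "nonzero cells; expected",
      round(expected_nonzero_cells, 1))
```

Any single simulated history will differ from the expected 9.6 nonzero cells and 0.4 average; those figures describe the long-run average over many such histories.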
Example intermittent demand & “best estimates”
Table 4 has a randomly generated set of intermittent demands. How might we best estimate demand for each cell (year and month)?
Table 5 summarizes how “well” using zero as an estimate for each cell works. The actual demands are in rows 3 to 6, the estimated demand of zero is in rows 8 to 11, and the error metric is in rows 13 to 16. The metric used is total absolute error. For each cell, we calculate the absolute value of the actual value minus the estimated value, then sum across each year and each month. We see “zero” has a low forecast error, with a total of 21.
Table 6 summarizes how well using the average value (0.4) does. Its error metric value is 32.2.
Table 7 summarizes how well using last year to estimate this year works. Its error metric value is 34.
Observe that, based on a standard forecast error metric, the estimate of “zero” works much better than the two alternative methods. However, relying on the standard metric to identify the right forecast method will be disastrous to the firm. To understand this, compare total actual demand with total estimated demand. The “zero” method instructs the firm to produce or acquire zero units of this product, while the other two methods do much better at estimating aggregate demand.
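The same comparison can be reproduced on any demand history. The following is a minimal Python sketch on a small hypothetical two-year history, which is not the data from Tables 4 to 7. The helper name `total_absolute_error` is illustrative, and the “average” estimate here uses last year's mean as a stand-in for the known long-run average used in the tables.

```python
def total_absolute_error(actual, forecast):
    """Sum of |actual - forecast| over all cells."""
    return sum(abs(a - f) for a, f in zip(actual, forecast))


# Hypothetical monthly demand (NOT the values from the tables above).
last_year = [0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 3, 0]
this_year = [0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 3]

mean_rate = sum(last_year) / len(last_year)  # average demand per month
forecasts = {
    "zero": [0] * len(this_year),
    "average": [mean_rate] * len(this_year),
    "last year": last_year,
}

# For each method: (total absolute error, total forecast demand).
results = {
    name: (total_absolute_error(this_year, forecast), sum(forecast))
    for name, forecast in forecasts.items()
}

print("actual total demand:", sum(this_year))
for name, (error, total) in results.items():
    print(f"{name:>9}: abs error = {error:5.1f}, forecast total = {total:4.1f}")
```

On this toy history the “zero” estimate again has the lowest total absolute error, yet it is the only method whose forecast total misses the actual total demand entirely, which is the gap the standard metric hides.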