*This information is imported from the Forecast Pro reference document.*
What is Statistical Forecasting?
Everybody forecasts, whether they know it or not. Businesses need to forecast future events to plan production, schedule their work force, or prepare even the simplest business plan.
Most business forecasting is still judgmental and intuitive. Sometimes this is appropriate. People must integrate information from a large variety of sources—qualitative and quantitative—and this is probably best done by using the extraordinary pattern recognition capabilities of the human brain. Unfortunately, many companies also use judgmental forecasting where they should not.
Not everyone understands the concept of forecasting. It tends to get mixed up with goal setting. If a company asks its salespeople to forecast sales for their territories, these “forecasts” often become the yardsticks by which they are judged.
The main advantage of statistical forecasting is that it separates the process of forecasting from that of goal setting and makes it systematic and objective. Statistical forecasting can help almost any business improve planning and performance. There is, in other words, value added for a business.
The future is uncertain, and this uncertainty can be represented quantitatively. Statistical forecasting represents uncertainty via a probability distribution, which associates each possible outcome with the likelihood of it occurring. Two kinds of information are needed to describe the distribution: the point forecast, which is essentially the “best guess” estimate, and the confidence limits, which capture how much uncertainty there is around the point forecast. The upper and lower confidence limits represent reasonable bounds for the forecast: you can be reasonably confident that the actual outcome will fall between them.
Forecast Pro depicts this information graphically, as well as numerically. In the graph below, the red line represents the point forecast, while the blue lines represent the upper and lower confidence limits.
The upper confidence limit is often calibrated to the ninety-fifth percentile. This means that the actual value should fall at or below the upper confidence limit about 95% of the time. You can set the percentiles of both the upper and lower confidence limits. Sometimes, the upper confidence limit will be more useful for planning than the point forecast.
Let’s illustrate this idea with an example. Suppose you were in charge of forecasting widget sales for your company. If you wanted to determine expected revenues for next month, you would be most interested in the point forecast: as the mean of the distribution, it minimizes the expected squared forecast error.
On the other hand, suppose you wanted to know how many widgets to produce. If you overproduce, warehousing costs will be excessive. But if you under-produce, you will probably lose sales. Since the cost of lost sales is usually greater than the cost of overstocking, you will be most interested in the upper confidence limit. The 95% upper confidence limit tells you how many widgets to produce to limit the chance of “stocking out” to less than 5%.
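Forecast Pro computes these limits for you, but the arithmetic behind the stocking decision can be sketched in a few lines. The sketch below assumes normally distributed demand; the point forecast and forecast standard error are illustrative values, not Forecast Pro output.

```python
from statistics import NormalDist

# Illustrative values: a point forecast (mean of the demand distribution)
# and a forecast standard error for next month's widget demand.
point_forecast = 1000.0
forecast_std_error = 120.0

demand = NormalDist(mu=point_forecast, sigma=forecast_std_error)

# The 95% upper confidence limit: producing this many widgets limits the
# chance of stocking out to 5%.
upper_95 = demand.inv_cdf(0.95)
```

For revenue planning you would use `point_forecast`; for the production decision, `upper_95` is the more useful number, exactly as described above.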
A wide variety of statistical forecasting techniques are available, ranging from very simple to very sophisticated. All of them try to capture the statistical distribution that we have just discussed.
Forecast Pro offers the forecasting methodologies that have been proven to be the most appropriate for business forecasting:
– simple moving averages,
– discrete data models (Poisson or negative binomial),
– curve fitting,
– Croston’s intermittent demand model,
– exponential smoothing,
– Bass diffusion model,
– forecasting by analogy,
– dynamic regression,
– event models and
– multiple-level forecasting.
The latest version of Forecast Pro also includes extreme gradient boosting, a machine learning methodology that has recently emerged as a very effective method for demand forecasting.
All these methodologies forecast the future by fitting quantitative models to statistical patterns from the past. Therefore, you must have historic records of your variables, preferably for several years.
Forecast accuracy depends upon the degree to which statistical data patterns exist and their stability over time. The more regular the series, the more accurate the forecasts are.
Six of the methodologies are uni-variate techniques. They forecast the future entirely from statistical patterns in the past.
The simple moving average is widely used in business, mostly because it is so easy to implement.
However, it is only appropriate for very short or very irregular data sets, where statistical features like trend and seasonality cannot be meaningfully determined.
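The simplicity of the moving average is easy to see in code. A minimal sketch, using illustrative data: the forecast for the next period is just the mean of the last n observations.

```python
# Simple moving average: forecast the next period as the mean of the
# most recent n observations. Data values are illustrative.
def moving_average_forecast(history, n=3):
    window = history[-n:]
    return sum(window) / len(window)

sales = [102, 98, 110, 105, 99, 107]
next_period = moving_average_forecast(sales, n=3)  # mean of 105, 99, 107
```

Note that the forecast is flat: every future period gets the same value, which is why the method cannot represent trend or seasonality.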
Discrete data models are used for data consisting of small whole numbers. These models are characteristically used to model a slow-moving item for which most orders are for only one piece at a time. Forecasts are non-trended and nonseasonal.
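As a sketch of the Poisson case, assuming the mean of the historic counts serves as the Poisson rate (the data and the fitting shortcut are illustrative, not Forecast Pro's estimation procedure):

```python
from math import exp, factorial

# Illustrative order history for a slow-moving item (small whole numbers).
orders = [0, 1, 0, 2, 1, 0, 1, 1]
lam = sum(orders) / len(orders)  # fitted Poisson rate (here 0.75)

def poisson_pmf(k, lam):
    # Probability of exactly k orders in a period under a Poisson model.
    return lam**k * exp(-lam) / factorial(k)

# Probability of seeing at most 2 orders next period.
p_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))
```

A distribution like this supports stocking decisions for slow movers in the same way the confidence limits do for faster-moving items.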
Croston’s intermittent demand model is not a widely known or used technique, but it can be extremely useful. It is usually used to model data in which a significant number of periods have zero demand but the non-zero orders may be substantial. This is characteristic of a slow-moving item which is ordered to restock a downstream inventory. Forecasts are non-trended and nonseasonal.
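The core idea of Croston's method can be sketched briefly: smooth the non-zero demand sizes and the intervals between demands separately, then take their ratio as the expected demand per period. The smoothing parameter and data below are illustrative.

```python
# A bare-bones sketch of Croston's intermittent demand method.
def croston(demand, alpha=0.1):
    # Initialize with the first non-zero demand.
    first = next(i for i, d in enumerate(demand) if d > 0)
    z = demand[first]  # smoothed demand size
    p = first + 1      # smoothed inter-demand interval
    q = 1              # periods since the last demand
    for d in demand[first + 1:]:
        if d > 0:
            z = alpha * d + (1 - alpha) * z
            p = alpha * q + (1 - alpha) * p
            q = 1
        else:
            q += 1
    return z / p  # expected demand per period

history = [0, 0, 4, 0, 0, 0, 6, 0, 5, 0]
rate = croston(history)
```

Because the forecast is a per-period rate rather than a period-by-period pattern, the output is non-trended and nonseasonal, as noted above.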
Exponential smoothing models are widely applicable. They are also widely used because of their simplicity, accuracy and ease of use. Their robustness makes them ideal even when the data are short and/or volatile. Exponential smoothing models estimate trend and seasonality and extrapolate them forward.
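The simplest member of the family, simple (level-only) exponential smoothing, can be sketched as follows. Forecast Pro's models also estimate trend and seasonality, and optimize the smoothing weight against the history; the fixed alpha and data here are illustrative.

```python
# Simple exponential smoothing: the level is a weighted average that
# discounts older observations geometrically.
def simple_exponential_smoothing(history, alpha=0.3):
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level  # flat forecast for all future periods

sales = [100, 104, 101, 99, 106, 103]
forecast = simple_exponential_smoothing(sales, alpha=0.3)
```

Larger values of alpha make the forecast react faster to recent changes; smaller values make it smoother and more stable.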
Box-Jenkins is a more elaborate statistical method than exponential smoothing. Box-Jenkins models estimate the historic correlations of the data and extrapolate them forward. It often outperforms exponential smoothing in cases when the data are fairly long and nonvolatile. However, it may not perform well when the data are unstable.
Machine learning is available as a uni-variate model and may be included in the expert selection algorithm, if desired. Like exponential smoothing and Box-Jenkins, machine learning uni-variate models leverage information inherent in the historic data (seasonality, trend) to create features. Using XGBoost, an extreme gradient boosting algorithm, these features are used to create an ensemble of decision trees to generate forecasts. Forecast Pro’s automatic machine learning model leverages our AI-driven expert selection algorithm to select features and an extreme gradient boosting tree model specification appropriate for your data.
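The feature-creation step mentioned above can be illustrated in isolation: turning a demand history into (lag-feature, target) training rows that a boosted-tree model such as XGBoost could consume. The lag-only feature choice here is an illustrative simplification, not Forecast Pro's actual feature set.

```python
# Sketch of lag-feature construction for a tree-based forecasting model.
def make_lag_features(history, n_lags=3):
    rows = []
    for t in range(n_lags, len(history)):
        features = history[t - n_lags:t]  # the n_lags previous values
        target = history[t]               # the value to predict
        rows.append((features, target))
    return rows

rows = make_lag_features([5, 7, 6, 8, 9, 7, 10], n_lags=3)
```

The ensemble of decision trees is then trained on rows like these; seasonal and trend features are built analogously from the same history.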
Forecast Pro’s expert selection automatically chooses the appropriate uni-variate forecasting technique for each item forecasted. By default, expert selection will determine which of the first five uni-variate models is most appropriate for the data, optimize the model and create the forecasts. If desired, you may include machine learning in expert selection as well. Alternatively, you can dictate that a specific method be used and customize your models. Forecast Pro provides extensive diagnostics and statistical tests to help you make informed decisions.
Forecast Pro includes six additional forecasting techniques that are not considered in expert selection:
– event models,
– custom component models,
– forecast by analogy,
– the Bass diffusion model,
– dynamic regression and
– custom machine learning models.
Event models are extensions of exponential smoothing models that allow you to capture responses to promotions, business interruptions and other aperiodic events. These models allow you to assign each period to a logical category and incorporate an adjustment for each category. For example, if you establish a category for promoted periods, then your model will include an adjustment for promoted periods. If you ran three different types of promotions, you could establish three categories and have a different adjustment for each type of promotion.
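A toy illustration of the adjustment idea: estimate an average lift for "promoted" periods and apply it on top of a baseline. Forecast Pro estimates event adjustments jointly within the smoothing model; this separate two-step calculation, with made-up data, is only a sketch.

```python
# Illustrative history with a flag marking promoted periods.
sales    = [100, 98, 150, 102, 160, 99, 101, 155]
promoted = [False, False, True, False, True, False, False, True]

base  = [s for s, p in zip(sales, promoted) if not p]
promo = [s for s, p in zip(sales, promoted) if p]

baseline = sum(base) / len(base)              # level in normal periods
lift = (sum(promo) / len(promo)) / baseline   # promotion adjustment index

forecast_normal_period = baseline
forecast_promoted_period = baseline * lift
```

With three promotion types, you would estimate three separate lift indices, one per category, exactly as the text describes.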
Custom component models are also extensions of exponential smoothing models. The method generates statistical forecasts for the different components found in an exponential smoothing model (sales level, trend, seasonal pattern and events) and then allows you to customize any of the estimated components.
The model is very useful in circumstances where not all the components can be accurately estimated from the demand history. Examples include short data sets where the seasonal pattern cannot be reliably estimated and you wish to use a seasonal pattern from a similar product, forecasting the impact of future events that have not occurred historically, tempering the trend for longer-term forecasts, etc.
Forecast by analogy is a new product forecasting technique that allows you to create a forecast that “looks like” a different product’s demand pattern or a launch profile that you create.
The Bass diffusion model is a new product forecasting technique designed to forecast the spread of a new product based on the adoption rates of two types of users—innovators who are driven by their desire to try new products and imitators who are primarily influenced by the behavior of their peers.
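In discrete time the Bass model can be sketched directly: each period, a fraction of the remaining market adopts, driven by an innovation coefficient p and an imitation coefficient q that grows in influence as cumulative adoption rises. The parameter values below are illustrative, not Forecast Pro defaults.

```python
# Discrete-time sketch of the Bass diffusion model.
def bass_adoptions(m, p, q, periods):
    cumulative = 0.0
    adoptions = []
    for _ in range(periods):
        # Innovators adopt at rate p; imitators at rate q * (share adopted).
        new = (p + q * cumulative / m) * (m - cumulative)
        adoptions.append(new)
        cumulative += new
    return adoptions

sales = bass_adoptions(m=10000, p=0.03, q=0.4, periods=12)
```

The resulting curve rises to a peak and then declines as the market saturates, the characteristic shape of a new product launch.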
Dynamic regression produces a forecast based on the forecasted item’s history (like uni-variate methods) and that of the explanatory variables (e.g., product promotion, advertising, demographic variables, or macroeconomic indicators). You must provide historic values for the variable to be forecast and the explanatory variables.
Forecast Pro can provide automated selection for the dynamic terms (e.g., lagged dependent variables and Cochrane-Orcutt terms), but the user needs to specify which explanatory variables to include in the model. Forecast Pro provides several test batteries and diagnostics for variable selection and gives you specific advice on how to improve the model. Building a dynamic regression model thus consists of deciding which variables to consider, and following the program’s advice, step-by-step, to identify your final model.
Dynamic regression can outperform exponential smoothing and Box-Jenkins in cases where strong explanatory variables exist, and you have reasonably accurate forecasts for them. Unfortunately, this is not always the case, and in those instances the forecasts may not be as accurate as those from uni-variate methods.
Custom machine learning models may also include event schedules and explanatory variables as features in a boosted decision tree ensemble model. As with event models and dynamic regression, you must provide historic values for events and explanatory variables.
Like Forecast Pro’s automatic machine learning models, our custom machine learning methodology uses XGBoost, an extreme gradient boosting algorithm, to create an ensemble of decision trees.
Forecast Pro’s machine learning algorithm will automatically select the most relevant features from both the user-provided features and the features Forecast Pro generates. You can automatically train custom machine learning models using the same AI-driven logic that powers Forecast Pro’s automatic machine learning uni-variate models, or you can specify a particular boosted decision tree structure.
If you are new to forecasting and these techniques seem a little intimidating, don’t worry. We designed Forecast Pro to guide you completely through the forecasting process. Just follow the program’s advice and you will soon be generating accurate forecasts and adding value to your business.
Some Forecasting Tips
Forecast Pro uses your data history to forecast the future, so it is extremely important that your data be as accurate and complete as possible. Keep in mind the rule, “garbage in, garbage out!”
You will also want to give some thought to what data you should forecast. If you want to forecast demand for your product, you should probably input and forecast incoming orders rather than shipments, which are subject to production delays, warehousing effects, labor scheduling, etc. Many corporations are making large investments to obtain data as close to true demand as possible.
The more data you can supply the program, the better! The program can work with as few as five data points, but the forecasts from very short series are simplistic and less accurate. Although collecting additional data may require some effort, it is usually worth it.
If your data are seasonal, it is particularly important that you have adequate data length. The automatic model selection algorithms in Forecast Pro will not consider seasonal models unless you have at least two years’ worth of data. This is because you need at least two samples for each month or quarter to distinguish seasonality from one-time irregular patterns. Ideally, you should use three or more years of data to build a seasonal model.