Demand Planning and Forecasting

Demand planning translates market signals into an operational volume forecast used to drive supply, inventory, labor, and capacity decisions. A good demand plan is not the most accurate forecast — it’s the forecast that minimizes total system cost (stockouts + excess inventory + expediting + overtime) across all planning horizons.

| Method | Best For | Limitation |
| --- | --- | --- |
| Simple moving average | Stable, no trend/seasonality | Lags trends; treats all periods equally |
| Weighted moving average | Stable with slight trend | Manual weight-setting |
| Exponential smoothing (SES) | Stable demand, self-correcting | No trend or seasonality |
| Holt’s method | Demand with trend | No seasonality |
| Holt-Winters (triple ES) | Trend + seasonality | Requires 2+ years of history |
| Regression (causal) | Demand driven by external variables (price, weather, GDP) | Requires leading-indicator data |
| ML / AI (LSTM, XGBoost) | Complex patterns, large SKU counts | Black box; requires data maturity |

Most planning systems use Holt-Winters or a variant as the statistical baseline for replenishment SKUs, then layer in judgment adjustments.
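As an illustration, simple exponential smoothing — the base recursion that Holt and Holt-Winters extend — fits in a few lines. This is a hand-rolled sketch, not any planning system’s implementation; the function name, α value, and demand figures are illustrative:

```python
def ses(history, alpha=0.3):
    """One-step-ahead simple exponential smoothing forecast."""
    level = history[0]                      # initialize level at the first observation
    for actual in history[1:]:
        # new level = alpha * latest actual + (1 - alpha) * previous level
        level = alpha * actual + (1 - alpha) * level
    return level                            # forecast for the next period

weekly_demand = [100, 102, 98, 105, 101, 99]
next_week = ses(weekly_demand)              # ≈ 100.6 units
```

Holt adds a smoothed trend term to this recursion, and Holt-Winters adds a seasonal index on top of both.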

| Metric | Formula | Notes |
| --- | --- | --- |
| MAPE | Mean \|Actual − Forecast\| / Actual × 100 | Intuitive but distorted by low-volume SKUs |
| WMAPE | Σ\|Actual − Forecast\| / Σ Actual × 100 | Preferred — volume-weighted, not distorted by small SKUs |
| MAD | Mean \|Actual − Forecast\| | Same units as demand; useful for operations |
| Bias | Mean (Forecast − Actual) | Positive = over-forecast; negative = under-forecast |
| Tracking signal | Cumulative forecast error / MAD | \|TS\| > 4 signals systematic bias; trigger model review |

WERC best-in-class benchmark: WMAPE < 15% at the SKU/week level. Most organizations operate at 20–35% WMAPE; top performers in stable categories reach 8–12%.
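A sketch of these metrics on paired actual/forecast series, following the sign convention in the table (positive bias = over-forecast); function names and data are illustrative:

```python
def wmape(actuals, forecasts):
    # volume-weighted: one large-SKU miss outweighs many tiny-SKU misses
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / sum(actuals) * 100

def bias(actuals, forecasts):
    # positive = over-forecast, negative = under-forecast
    return sum(f - a for a, f in zip(actuals, forecasts)) / len(actuals)

def tracking_signal(actuals, forecasts):
    errors = [f - a for a, f in zip(actuals, forecasts)]
    mad = sum(abs(e) for e in errors) / len(errors)
    return sum(errors) / mad                # |TS| > 4 flags systematic bias
```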

ABC/XYZ segmentation divides the SKU portfolio so that appropriate forecasting methods and safety stock policies can be assigned to each segment:

|  | A (high volume) | B (medium) | C (low volume) |
| --- | --- | --- | --- |
| X (stable) | Statistical model; tight safety stock | Statistical model | Statistical model; moderate buffer |
| Y (variable) | Statistical + market input | Statistical + judgment | Statistical; wide buffer |
| Z (erratic/intermittent) | Consensus-heavy; min/max | Judgment-based | Make-to-order or zero stock |

C/Z SKUs (low volume, erratic) are the biggest source of forecast error and excess inventory. Policies matter more than forecasting algorithms for this segment.

The statistical baseline is an input, not the output. The consensus process layers in:

  1. Statistical baseline: System-generated, SKU/week level
  2. Sales input: Promotional events, key account changes, new business pipeline
  3. Marketing input: Trade spend, pricing changes, advertising calendars
  4. Product management: New launches, end-of-life, portfolio changes
  5. Consensus adjustment: Final one-number plan, with assumption log

Assumption logging is critical — if the forecast misses, you need to know which assumption failed.

No demand history exists for new products, so statistical extrapolation cannot be used. Common methods:

  • Analogue: Map to the lifecycle curve of a similar past product; scale by relative pricing, distribution, and marketing support
  • Market research: Consumer/customer surveys; pilot/test market data
  • Management judgment: Expert opinion with structured uncertainty ranges (P10/P50/P90)

New product forecast error rates of 40–70% in the first 6 months are common. Plan for this with flexible supply capacity and cautious initial inventory commitments.

Short-horizon demand sensing (1–14 days) supplements the statistical forecast with high-frequency signals:

  • POS data from retailers (daily sell-through at store level)
  • Customer order patterns (order frequency, order size trends)
  • Warehouse shipment actuals vs. plan

Demand sensing tools (Blue Yonder, o9, ToolsGroup) improve short-horizon MAPE by 30–50% vs. unconstrained statistical models, directly reducing safety stock requirements at the DC level.

CPFR (Collaborative Planning, Forecasting and Replenishment) is a retailer-supplier process in which:

  • Retailer shares POS data and promotional calendars
  • Supplier shares production constraints and lead times
  • Joint forecast is agreed and exception-managed

CPFR is most effective with top-5 retail accounts. Standard protocol defined by GS1/VICS. Requires EDI or portal connectivity and sustained account management attention.

| Bias | Manifestation | Mitigation |
| --- | --- | --- |
| Optimism bias | Sales always over-forecasts new products | Separate demand forecast from sales quota |
| Sandbagging | Sales under-forecasts to manage expectations | Accountability to forecast accuracy metric |
| Anchoring | Forecast snaps to last year’s number | Statistical override with documented exceptions |
| Hockey stick | Back-loads volume to end of period | Period-level shape analysis |

Intermittent Demand Methods (Course 4.5 Depth)

Standard exponential smoothing and Holt-Winters fail for intermittent demand — items that have frequent zero-demand periods interspersed with irregular positive demands (spare parts, industrial MRO, slow-moving B2B SKUs). Three dedicated methods:

Syntetos-Boylan Demand Classification Matrix

Before choosing a forecasting method, classify the demand pattern using two dimensions:

  • ADI (Average Demand Interval): average number of periods between non-zero demand occurrences. ADI > 1.32 = intermittent.
  • CV² (squared coefficient of variation of non-zero demand sizes): measures how variable the demand quantity is when it does occur. CV² > 0.49 = erratic.
|  | Low CV² (≤ 0.49) | High CV² (> 0.49) |
| --- | --- | --- |
| Low ADI (≤ 1.32) | Smooth — use SES/Holt-Winters | Erratic — use Holt with wide buffer |
| High ADI (> 1.32) | Intermittent — use Croston | Lumpy — use TSB |
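The classification can be computed directly from a per-period demand history (zeros included). This sketch uses the 1.32 / 0.49 cut-offs above; the simple period-count estimator of ADI and the population variance are assumptions of the sketch:

```python
def classify(demand):
    nonzero = [d for d in demand if d > 0]
    adi = len(demand) / len(nonzero)                  # avg periods per non-zero demand
    mean = sum(nonzero) / len(nonzero)
    var = sum((d - mean) ** 2 for d in nonzero) / len(nonzero)
    cv2 = var / mean ** 2                             # squared coefficient of variation
    if adi <= 1.32:
        return "smooth" if cv2 <= 0.49 else "erratic"
    return "intermittent" if cv2 <= 0.49 else "lumpy"
```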

Croston’s method, developed by J.D. Croston (1972) for spare parts forecasting, decomposes intermittent demand into two separate exponential smoothing processes:

  • D̄ (average demand size when demand occurs): updated only in periods with non-zero demand
  • T̄ (average inter-demand interval): likewise updated only in periods with non-zero demand

Forecast = D̄ / T̄ (average demand per period)

Croston is 15–30% more accurate than exponential smoothing for intermittent demand items. The weakness: Croston’s estimator has a positive bias — it systematically overestimates demand.
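A minimal sketch of the Croston recursion, assuming a single smoothing parameter α shared by the size and interval estimates (implementations sometimes use separate parameters):

```python
def croston(demand, alpha=0.1):
    size = None        # D-bar: smoothed non-zero demand size
    interval = None    # T-bar: smoothed inter-demand interval
    periods_since = 1
    for d in demand:
        if d > 0:
            if size is None:                     # first non-zero demand initializes both
                size, interval = d, periods_since
            else:
                size = alpha * d + (1 - alpha) * size
                interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1                   # nothing is updated in zero periods
    return size / interval                       # average demand per period
```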

The Syntetos-Boylan Approximation (SBA) corrects Croston’s positive bias with a simple adjustment:

F_SBA = (1 - α/2) × D̄/T̄

Where α is the smoothing parameter. The (1 - α/2) term deflates the Croston estimate, removing the systematic upward bias. SBA is the preferred method for intermittent demand in most planning software implementations because it is unbiased and computationally trivial.
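Applied to Croston-style outputs, the deflation is one line (numbers illustrative):

```python
def sba_forecast(size, interval, alpha=0.1):
    # deflate the Croston ratio by (1 - alpha/2) to remove its upward bias
    return (1 - alpha / 2) * size / interval

# e.g. smoothed size 4 units, smoothed interval 2 periods, alpha = 0.1:
# (1 - 0.05) * 4 / 2 = 1.9 units per period (vs. Croston's 2.0)
```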

TSB (Teunter-Syntetos-Babai) takes a different approach: instead of smoothing demand size and demand interval separately, it smooths the probability of demand occurring each period together with the expected demand size. This makes TSB adaptive to demand-pattern changes — particularly useful for spare parts late in a product’s lifecycle, when demand is declining toward obsolescence. When the smoothed demand probability falls below a threshold, the TSB forecast naturally decays to zero, enabling proactive write-off decisions rather than letting obsolete stock linger through successive inventory cycles.

Best use case: slow-moving spare parts with lifecycle dynamics; any item where you expect demand to eventually reach zero and need the forecast to adapt proactively.
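A sketch of the TSB recursion as described above, with separate smoothing parameters for occurrence probability and demand size; the initialization choices are assumptions of this sketch:

```python
def tsb(demand, alpha=0.1, beta=0.1):
    """alpha smooths the demand size; beta smooths the occurrence probability."""
    prob = 1.0 if demand[0] > 0 else 0.0    # smoothed P(demand occurs this period)
    size = demand[0] if demand[0] > 0 else 1.0
    for d in demand[1:]:
        if d > 0:
            prob += beta * (1 - prob)
            size = alpha * d + (1 - alpha) * size
        else:
            prob += beta * (0 - prob)        # decays toward zero in dead periods
    return prob * size                       # expected demand per period
```

Unlike Croston, the probability estimate updates every period, so a long run of zeros steadily drives the forecast toward zero.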

Hierarchical Forecasting and MinT Reconciliation

Most planning systems forecast at multiple levels simultaneously: total company → business unit → product family → SKU → SKU/location. The forecasts generated at different levels are often inconsistent — SKU-level forecasts don’t sum to the family-level forecast, which doesn’t sum to the total.

Reconciliation methods:

  • Top-down: Distribute the aggregate forecast proportionally to lower levels using historical shares. Simple but propagates aggregate errors.
  • Bottom-up: Sum SKU-level forecasts upward. Captures SKU-level variation but can miss macro patterns.
  • Middle-out: Forecast at product-family level; distribute down and aggregate up. Balances accuracy levels.
  • MinT (Minimum Trace reconciliation): Simultaneously reconciles all levels using optimal linear combinations that minimize the trace of the forecast error covariance matrix. Mathematically derives the best combination of forecasts across all levels.

MinT results from academic literature: MAPE reduction of 5.5–30.2% across test cases, averaging 14.9% improvement over non-reconciled forecasts. The improvement is largest when lower-level forecasts are noisy and aggregate-level forecasts are more stable. Commercial implementations: o9, SAP IBP (hierarchical reconciliation module).
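For intuition, the two simplest reconciliation schemes on a one-level hierarchy (total → SKUs) look like this. Full MinT additionally requires an estimated forecast-error covariance matrix and is normally left to a library or planning tool, so it appears only in the comment; all data here is illustrative:

```python
def bottom_up(sku_forecasts):
    # the total is simply the sum of the SKU-level forecasts
    return sum(sku_forecasts.values())

def top_down(total_forecast, historical_shares):
    # distribute the aggregate forecast using historical volume shares
    return {sku: total_forecast * share for sku, share in historical_shares.items()}

# MinT instead solves for the linear combination of ALL levels' forecasts that
# minimizes the trace of the reconciled error covariance matrix.
skus = {"A": 120.0, "B": 60.0, "C": 20.0}
total = bottom_up(skus)                                     # 200.0
split = top_down(200.0, {"A": 0.5, "B": 0.25, "C": 0.25})   # sums back to 200.0
```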

The M5 Competition (2020, Makridakis et al.) remains the most rigorous public benchmark of forecasting methods, using 42,840 time series from Walmart’s retail data.

Key findings:

  • LightGBM and gradient boosting ensemble methods dominated — winning solutions used gradient boosted trees with rich feature engineering over statistical methods
  • ML beats statistical when: high-dimensional feature space (price elasticity, promotions, calendar events), large portfolio (thousands of SKUs where cross-SKU patterns exist), and sufficient data volume (2+ years of weekly history)
  • Statistical methods remain competitive for: single-item forecasting with limited history, interpretability requirements, environments without data science capability

Practical implication: for a company with 10,000+ SKUs, retail POS data, and a planning team that includes data scientists, ML forecasting is worth the investment. For a 500-SKU B2B distributor without data science capability, Holt-Winters with Croston for slow movers is the pragmatic choice.

FVA (Forecast Value Added, developed by Michael Gilliland at SAS) measures whether each step in the consensus forecasting process improves or degrades accuracy relative to the previous step.

FVA calculation: Compare MAPE of the adjusted forecast vs. MAPE of the unadjusted baseline.

  • Positive FVA: the adjustment improved accuracy — it added value
  • Negative FVA: the adjustment made the forecast worse — it destroyed value
  • Zero FVA: the adjustment made no difference

The benchmark for any manual step: does it beat a naïve forecast (last period’s actual)? If your consensus process can’t beat naïve forecasting, you have a serious process problem.
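FVA is simple arithmetic on accuracy metrics; a sketch using MAPE, with illustrative data:

```python
def mape(actuals, forecasts):
    return sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals) * 100

def fva(actuals, baseline, adjusted):
    # positive = the adjustment lowered MAPE (added value); negative = destroyed value
    return mape(actuals, baseline) - mape(actuals, adjusted)

actuals = [100, 100]
step_value = fva(actuals, baseline=[90, 110], adjusted=[95, 105])  # positive here
```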

What FVA typically reveals: analyst overrides frequently produce negative FVA — the human judgment made the forecast less accurate than the statistical baseline. This is not a criticism of analysts; it reflects anchoring bias, optimism bias, and the hockey-stick phenomenon. FVA analysis makes this visible for the first time in most organizations that run it.

FVA should be measured at every step: statistical baseline → sales input → marketing input → management override → final consensus. Steps with consistently negative FVA should be eliminated or restructured.

Demand sensing supplements the statistical forecast in the 0–8 week window using high-frequency signals unavailable to monthly forecasting processes:

  • POS (Point of Sale) data: daily sell-through by store and SKU — actual consumer demand, not retailer orders
  • EDI order patterns: customer order frequency and size trends in the current week
  • Social signals: search trend data, social media volume as leading indicators for specific categories
  • Weather data: for weather-sensitive categories (beverages, ice cream, seasonal apparel)

Demand sensing improvement: 5–20% accuracy improvement over conventional methods in the 0–8 week window. The improvement is larger when conventional forecasting relies on orders (which lag POS by 1–3 weeks) rather than POS data directly.

Commercial implementations: Blue Yonder Luminate Demand Sensing, o9 Demand Sensing, ToolsGroup SO99+.

New Product Forecasting: The Analogue Method in Practice

The analogue method maps a new product to the lifecycle curve of a comparable past product. The key is defining “comparable” rigorously:

  1. Select analogues based on: category similarity, price tier, channel overlap, marketing support level
  2. Scale the analogue curve by: relative distribution breadth, relative advertising spend, relative price premium/discount
  3. Define scenario range: P10 (pessimistic), P50 (base), P90 (optimistic) from the spread of comparable analogues
  4. Plan supply flexibility against P50 but communicate the P10/P90 range to procurement so capacity decisions account for the full range

New product forecast error of 40–70% in the first 6 months is normal. The goal is not accuracy — it’s setting supply flexibility correctly and not over-committing inventory to a point forecast that has a ±50% uncertainty band.
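The scaling in steps 1–2 can be sketched as below. All numbers are illustrative, and for simplicity the P10/P90 band here comes from scenario factors rather than the spread of several analogues as step 3 prescribes:

```python
# Illustrative analogue lifecycle curve — not from any real product
analogue_curve = [1000, 1800, 2400, 2600, 2500, 2300]     # analogue's units/month

def scale_curve(curve, distribution_ratio, price_factor):
    # scale each month by relative distribution breadth and a price-driven factor
    return [round(v * distribution_ratio * price_factor) for v in curve]

p50 = scale_curve(analogue_curve, distribution_ratio=0.8, price_factor=1.0)
p10 = scale_curve(analogue_curve, 0.8, 0.6)               # pessimistic scenario
p90 = scale_curve(analogue_curve, 0.8, 1.4)               # optimistic scenario
```

Supply is planned against P50, while procurement sees the full P10–P90 spread for capacity decisions.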
