Presample Values for regARIMA Model Estimation

Presample data comes from time points before the beginning of the observation period. In Econometrics Toolbox™, you can specify your own presample data or use generated presample data.

[Figure: time series plot in which the shaded region before the observation period marks the presample period.]

In regression models with ARIMA errors, the distribution of the current innovation (ε_t) is conditional on historic information (H_t). Historic information can include past unconditional disturbances or past innovations, i.e., H_t = {u_{t–1}, ε_{t–1}, u_{t–2}, ε_{t–2}, ..., u₀, ε₀, u_{–1}, ε_{–1}, ...}. However, the software includes neither past responses (y_t) nor past predictors (X_t) in H_t. For example, in a regression model with ARIMA(2,1,1) errors, you can write the error model in several ways:

$(1 - ϕ_{1} L - ϕ_{2} L^{2}) (1 - L) u_{t} = (1 + θ_{1} L) ε_{t} .$
$(1 - L - ϕ_{1} (L - L^{2}) - ϕ_{2} (L^{2} - L^{3})) u_{t} = (1 + θ_{1} L) ε_{t} .$
$u_{t} = u_{t - 1} + ϕ_{1} (u_{t - 1} - u_{t - 2}) + ϕ_{2} (u_{t - 2} - u_{t - 3}) + ε_{t} + θ_{1} ε_{t - 1} .$
$ε_{t} = u_{t} - u_{t - 1} - ϕ_{1} (u_{t - 1} - u_{t - 2}) - ϕ_{2} (u_{t - 2} - u_{t - 3}) - θ_{1} ε_{t - 1} .$

The last equation implies that:

The first innovation in the series (ε₁) depends on the history H₁ = {u_–2,u_–1,u₀,ε₀}. H₁ is neither observable nor inferable from the regression model.
The second innovation in the series (ε₂) depends on the history H₂ = {u_–1,u₀,u₁,ε₁}. The software can infer u₁ and ε₁, but not the others.
The third innovation in the series (ε₃) depends on the history H₃ = {u₀,u₁,u₂,ε₂}. The software can infer u₁, u₂, and ε₂, but not u₀.
The rest of the innovations depend on inferable unconditional disturbances and innovations.

Therefore, the software requires three presample unconditional disturbances to initialize the autoregressive portion, and one presample innovation to initialize the moving average portion.
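
The recursion in the equation for ε_t above can be sketched in Python to show where the presample values enter. This only illustrates the arithmetic and is not the toolbox's implementation. The function name infer_innovations, its argument layout, and the numeric values are assumptions chosen for the example.

```python
def infer_innovations(u, u0, e0, phi1, phi2, theta1):
    """Recover innovations for ARIMA(2,1,1) errors (illustrative sketch).

    u  : unconditional disturbances u_1, ..., u_T from the observation period
    u0 : three presample disturbances [u_{-2}, u_{-1}, u_0], oldest first
    e0 : one presample innovation, epsilon_0
    """
    if len(u0) != 3:
        raise ValueError("ARIMA(2,1,1) errors need three presample disturbances")
    full_u = list(u0) + list(u)   # full_u[k] holds u_{k-2}
    eps_prev = e0                 # initializes the moving average term
    eps = []
    for k in range(3, len(full_u)):
        d0 = full_u[k] - full_u[k - 1]       # u_t     - u_{t-1}
        d1 = full_u[k - 1] - full_u[k - 2]   # u_{t-1} - u_{t-2}
        d2 = full_u[k - 2] - full_u[k - 3]   # u_{t-2} - u_{t-3}
        e = d0 - phi1 * d1 - phi2 * d2 - theta1 * eps_prev
        eps.append(e)
        eps_prev = e
    return eps


# Assumed values: zero presample history and arbitrary coefficients.
eps = infer_innovations(u=[1.0, 2.0, 4.0], u0=[0.0, 0.0, 0.0], e0=0.0,
                        phi1=0.5, phi2=0.25, theta1=0.5)
print(eps)
```

Without u0 and e0 the loop cannot start at t = 1, which is why the presample values are needed.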

The degrees of the compound autoregressive and moving average polynomials determine the number of past unconditional disturbances and innovations that ε_t depends on. The compound autoregressive polynomial is the product of the seasonal and nonseasonal autoregressive polynomials and the seasonal and nonseasonal integration polynomials. The compound moving average polynomial is the product of the seasonal and nonseasonal moving average polynomials. In the example, the degree of the compound autoregressive polynomial is P = 3, and the degree of the compound moving average polynomial is Q = 1. Therefore, the software requires three presample unconditional disturbances and one presample innovation.
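
As a check on these degrees, the following Python sketch multiplies the lag polynomials of the ARIMA(2,1,1) example and reads off P and Q. The helper names poly_mul and degree and the coefficient values are assumptions made for illustration; the toolbox does not expose them.

```python
def poly_mul(a, b):
    """Multiply two lag polynomials stored as coefficient lists.

    Index k of each list holds the coefficient on L**k.
    """
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def degree(p):
    """Highest power of L that has a nonzero coefficient."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return 0


phi1, phi2, theta1 = 0.5, 0.25, 0.5   # assumed, illustrative coefficients

ar = [1.0, -phi1, -phi2]       # 1 - phi1*L - phi2*L^2
integration = [1.0, -1.0]      # 1 - L
ma = [1.0, theta1]             # 1 + theta1*L

compound_ar = poly_mul(ar, integration)
P = degree(compound_ar)        # number of presample disturbances needed
Q = degree(ma)                 # number of presample innovations needed
print(compound_ar, P, Q)
```

The coefficients of compound_ar match the expanded form 1 - L - ϕ₁(L - L²) - ϕ₂(L² - L³) given earlier. A seasonal model would add further factors to each product.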

If you do not have presample values (or do not supply them), then, by default, the software backcasts for the necessary presample unconditional disturbances, and sets the necessary presample innovations to 0.

Another option to obtain presample unconditional disturbances is to partition the data set into a presample portion and estimation portion:

Partition the data such that the presample portion contains at least max(P,Q) observations. The software uses the most recent max(P,Q) observations and ignores the rest.
For the presample portion, regress y_t onto X_t.
Infer the residuals from the regression model. These are the presample unconditional disturbances.
Pass the presample unconditional disturbances (U0) and the estimation portion of the data into estimate.
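
The partitioning steps above can be sketched in Python as follows. This is a minimal illustration that assumes a single predictor and a regression with an intercept. The function names ols_residuals and presample_disturbances and the data are hypothetical, and the final call to estimate in the toolbox is not shown.

```python
def ols_residuals(y, x):
    """Residuals from regressing y on an intercept and one predictor x."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def presample_disturbances(y, x, n_pre, P, Q):
    """Split the data and return (u0, y_est, x_est).

    u0 holds the most recent max(P, Q) presample regression residuals.
    """
    need = max(P, Q)
    if n_pre < need:
        raise ValueError("presample portion needs at least max(P, Q) observations")
    resid = ols_residuals(y[:n_pre], x[:n_pre])   # presample regression
    return resid[-need:], y[n_pre:], x[n_pre:]


# Made-up data: the first four observations form the presample portion.
y = [1.0, 3.0, 2.0, 4.0, 5.0, 4.5, 6.0, 7.0]
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
u0, y_est, x_est = presample_disturbances(y, x, n_pre=4, P=3, Q=1)
print(u0, len(y_est))
```

Here u0 plays the role of the presample unconditional disturbances, and y_est and x_est form the estimation portion, which is shorter than the original series by n_pre observations.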

This option reduces the effective sample size. When you compare multiple models using likelihood-based measures of fit (such as likelihood ratio tests or information criteria), the models must use the same estimation portion of the data, and their presample portions must be of equal size.

If you plan on specifying presample values, then you must specify at least the number necessary to initialize the series.

You can specify both presample unconditional disturbances and innovations, one or the other, or neither.
