Using the NAG Toolbox for MATLAB - Part 3

Previous articles in this series are here and here; in this note, we continue our exploration of the Toolbox, demonstrating how it allows users to call any NAG Library routine from within MATLAB, and use MATLAB's plotting facilities to view the results.

Note: The code examples in this article have been extracted from demo scripts, and will not necessarily work properly if cut and pasted from this page into MATLAB. The full versions of the scripts, which were used to make the figures in this article, are available in this archive.

The G13 chapter - Time series analysis

A time series is a set of observations of some time-dependent process, collected at various points in time. The G13 chapter of the NAG Library contains several routines for investigating and modelling the statistical structure of time series; the models constructed by these routines may then be used to better understand the data, or to create forecasts (i.e. predictions of future behaviour) from the series. For example, a so-called autoregressive integrated moving average (ARIMA) model can be fitted to the series - see below.

One way of initially characterising a time series is to calculate its autocorrelation function, which describes the correlation (or degree of dependence) that exists between the behaviour of the underlying process at different points in time. The separation between the different times is called the lag, and the autocorrelation function is usually expressed as a set of autocorrelation coefficients, for different values of the lag. The routine g13ab can be used to compute this, along with more elementary statistical quantities such as the mean and variance. Here's the code:

   
% Here's the time series data (this comes from 
% sunspot readings).
x = [5; 11; 16; 23; 36;
58; 29; 20; 10; 8;
3; 0; 0; 2; 11;
27; 47; 63; 60; 39;
28; 26; 22; 11; 21;
40; 78; 122; 103; 73;
47; 35; 11; 5; 16;
34; 70; 81; 111; 101;
73; 40; 20; 16; 5;
11; 22; 40; 60; 80.9];

% nk is the number of lags for which the autocorrelations
% are required.
nk = int32(40);

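% Call the NAG routine. On return, xm and xv hold the mean and
% variance of the series, coeff holds the nk autocorrelation
% coefficients, stat is a test statistic for their significance,
% and ifail is the error flag.
[xm, xv, coeff, stat, ifail] = g13ab(x, nk);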
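
For readers without the Toolbox to hand, these quantities can be cross-checked from their textbook definitions. The following is a minimal plain-MATLAB sketch, assuming the usual sample estimates (an n-1 divisor for the variance, and lag-k products divided by the total sum of squares). The names xm_check, xv_check and coeff_check are illustrative only and are not part of the NAG interface; only the first ten observations are used, to keep the sketch short.

```matlab
% Plain-MATLAB sketch of the sample mean, variance and autocorrelation
% coefficients (assumed definitions; not the NAG implementation).
x  = [5; 11; 16; 23; 36; 58; 29; 20; 10; 8];  % first ten values of the series
nk = 4;                                       % number of lags (illustrative)

n        = numel(x);
xm_check = sum(x) / n;             % sample mean
d        = x - xm_check;           % deviations from the mean
xv_check = sum(d.^2) / (n - 1);    % sample variance, n-1 divisor

coeff_check = zeros(nk, 1);
for k = 1:nk
    % Lag-k coefficient: sum of lagged products of the deviations,
    % divided by the total sum of squared deviations.
    coeff_check(k) = sum(d(1:n-k) .* d(k+1:n)) / sum(d.^2);
end

disp(coeff_check);
```

With the full series and nk = 40, the same loop should give values comparable to the xm, xv and coeff returned by the routine, provided the definitions assumed here match those it uses.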

Because lag is a discrete variable, the autocorrelation function is best displayed as a bar chart (sometimes called an autocorrelogram in this context), as in this picture:

 
Figure 1: Computing the autocorrelation function of a time series.
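
A plot of this kind can be sketched with MATLAB's standard bar function. The fragment below is an illustration rather than the demo script itself, which may differ in its details. It assumes coeff is the vector of coefficients for lags 1 to nk; if coeff is not in the workspace, it falls back on clearly labelled stand-in values so that it still runs.

```matlab
% Sketch of an autocorrelogram plot. If coeff is not already in the
% workspace (e.g. the Toolbox is not installed), use stand-in values.
% The stand-in is a damped cosine chosen purely for illustration; it is
% NOT the output of g13ab for the sunspot data.
if ~exist('coeff', 'var')
    k     = (1:40)';
    coeff = exp(-k/30) .* cos(2*pi*k/11);
end

lags = (1:numel(coeff))';   % lag values 1, 2, ..., nk

figure;
bar(lags, coeff);           % one bar per lag
xlabel('Lag');
ylabel('Autocorrelation coefficient');
title('Autocorrelation function');
```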

The autocorrelation function contains both quantitative and qualitative information about the time-dependence of the underlying process; in this example, the period of the oscillations indicates a seasonality of around 11 units. In addition, the shape of the autocorrelation plot can be used to give some indication of suitable model parameters when fitting an ARIMA model to the time series. For a stationary series, the curve should tail off quickly to zero; failure to do so, as in Figure 1, may indicate that the series is non-stationary, which necessitates further treatment. If the correlation is high for the first few lags and then quickly tails off, it suggests a so-called moving average (MA) series, whilst a sinusoidal shape is often associated with an autoregressive (AR) series. In many cases, a full ARIMA model (i.e. one with both AR and MA components) is required to fit the series.
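
The article does not say which further treatment it has in mind for a non-stationary series. One common choice, and the source of the "integrated" part of the ARIMA name, is differencing: replacing the series by the changes between successive observations. A minimal sketch using MATLAB's built-in diff function, on the first ten observations only, might look like this:

```matlab
% First differencing as one possible treatment for non-stationarity
% (an assumption for illustration; not taken from the demo script).
x  = [5; 11; 16; 23; 36; 58; 29; 20; 10; 8];  % first ten values of the series
dx = diff(x);                                 % dx(i) = x(i+1) - x(i)

disp(dx');
```

The differenced series has one fewer observation than the original; its autocorrelation function can then be examined in the same way as before.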

Besides the autocorrelation function, additional insight may be obtained from a plot of the partial autocorrelation function; this can be produced by passing the autocorrelation coefficients computed by g13ab to the routine g13ac.
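
To show how partial autocorrelations relate to ordinary ones, here is a plain-MATLAB sketch of the standard Durbin-Levinson recursion. It is offered as an illustration of the idea, not as a description of how the NAG routine is implemented. The input r is a hypothetical set of autocorrelations, chosen to be the theoretical values for a first-order autoregressive process with coefficient 0.8 (r_k = 0.8^k); the names r, p and phi are illustrative.

```matlab
% Durbin-Levinson recursion: partial autocorrelations p(1..nl) from
% autocorrelations r(1..nl). The input values are illustrative only.
r  = [0.8; 0.64; 0.512; 0.4096];   % theoretical AR(1) values, r_k = 0.8^k
nl = numel(r);

phi = zeros(nl, nl);   % phi(k, j): j-th coefficient of the order-k AR fit
p   = zeros(nl, 1);    % p(k) = phi(k, k), the lag-k partial autocorrelation

phi(1, 1) = r(1);
p(1)      = r(1);
for k = 2:nl
    j   = (1:k-1)';
    num = r(k) - sum(phi(k-1, j)' .* r(k - j));
    den = 1    - sum(phi(k-1, j)' .* r(j));
    phi(k, k) = num / den;
    % Update the lower-order coefficients for the order-k fit.
    phi(k, j) = phi(k-1, j) - phi(k, k) * phi(k-1, k - j);
    p(k)      = phi(k, k);
end

disp(p');
```

For this input the partial autocorrelations are zero (to rounding error) beyond lag 1, which is the cut-off behaviour that makes the partial autocorrelation plot useful for identifying the order of an AR component.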

The MATLAB script for this demo is available as the file NAGToolboxDemos/Time_series_analysis/g13ab_demo.m, distributed in this archive.