It is often a challenge to debug code that involves large numbers of long stochastic series: it is very easy to think you have it right and not so easy to make sure. Lately I have needed to generate random correlated series whose means and covariance characteristics I know, so that I can verify various calculation procedures. I thought I would share a small function I wrote in R that generates the series.
I wanted to be able to provide a correlation matrix together with a set of means and standard deviations, and get back a set of series as the columns of a matrix with the requested number of rows.
The basic procedure is as follows:
- Provide the following inputs:
  - The required series length (i.e. number of periods), defaulting to 1,000.
  - The correlation matrix, defaulting to a 1 x 1 identity matrix.
  - A vector of means, one for each series, defaulting to 0.
  - A vector of standard deviations, one for each series, defaulting to 1.
- Check the inputs make sense or are interpretable in some meaningful way.
- Figure out from the dimensions of the correlation matrix how many series are requested.
- Build matrices for both the means and standard deviations that match the series that will be generated (i.e. series length rows x number of series columns).
- Create a matrix filled with random numbers ~N(0,1) with series length rows and number of series columns.
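- Compute the Cholesky decomposition of the correlation matrix and matrix-multiply the random number matrix by the resulting factor. R's `chol()` returns the upper-triangular factor, so the random number matrix goes on the left of the product.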
- This matrix, which has the desired correlation characteristics, is then scaled by the desired standard deviations and shifted by the desired means to get the final set of series to be returned.
Here’s the code:
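As a minimal sketch of the steps listed above, something along these lines should work in base R. The function name `correlated_series` and the argument names `n_periods`, `cor_matrix`, `means` and `sds` are assumptions chosen for illustration, and the input checks are only one reasonable reading of "make sense".

```r
# Sketch only: the function and argument names are illustrative assumptions.
correlated_series <- function(n_periods = 1000,
                              cor_matrix = diag(1),
                              means = 0,
                              sds = 1) {
  cor_matrix <- as.matrix(cor_matrix)
  # The number of series is implied by the size of the correlation matrix.
  n_series <- ncol(cor_matrix)

  # Check that the inputs make sense (chol() below also fails if the
  # matrix is not positive definite).
  stopifnot(
    length(n_periods) == 1, n_periods >= 1,
    nrow(cor_matrix) == n_series,
    isSymmetric(unname(cor_matrix)),
    all(abs(diag(cor_matrix) - 1) < 1e-8),
    length(means) %in% c(1, n_series),
    length(sds) %in% c(1, n_series),
    all(sds >= 0)
  )

  # Expand means and standard deviations to n_periods x n_series matrices,
  # recycling a single value across all series if only one was given.
  mean_mat <- matrix(rep(means, length.out = n_series),
                     nrow = n_periods, ncol = n_series, byrow = TRUE)
  sd_mat <- matrix(rep(sds, length.out = n_series),
                   nrow = n_periods, ncol = n_series, byrow = TRUE)

  # Independent N(0, 1) draws, one column per series.
  z <- matrix(rnorm(n_periods * n_series),
              nrow = n_periods, ncol = n_series)

  # chol() returns the upper-triangular U with t(U) %*% U == cor_matrix,
  # so the random matrix goes on the left of the product.
  correlated <- z %*% chol(cor_matrix)

  # Scale by the standard deviations, then shift by the means.
  correlated * sd_mat + mean_mat
}

# Example use: two series with a target correlation of 0.6.
set.seed(42)
target <- matrix(c(1, 0.6, 0.6, 1), nrow = 2)
x <- correlated_series(10000, target, means = c(5, -2), sds = c(2, 0.5))
sample_cor <- cor(x)        # should be close to target
sample_means <- colMeans(x) # should be close to c(5, -2)
```

The sample statistics only approach the targets as the series get longer, so any check against them needs a tolerance rather than an exact comparison.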
Aside: Cholesky Decomposition
I searched high and low for a proof that the Cholesky decomposition of an n x n correlation matrix transforms an array of n random vectors (~N(0,1)) into a set of vectors with the correlation coefficients specified in that matrix. I couldn’t find one, so here’s my version of the proof:
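One way to write the argument down is sketched here. The notation is an assumption made for this sketch: $C$ is the n x n correlation matrix, $U$ is its upper-triangular Cholesky factor (the convention R's `chol()` uses), $z$ is a 1 x n row of independent N(0,1) draws, and $x$ is the transformed row.

$$
\begin{aligned}
C &= U^\top U \\
x &= zU, \qquad \mathbb{E}[z] = 0, \qquad \mathbb{E}[z^\top z] = I_n \\
\mathbb{E}[x] &= \mathbb{E}[z]\,U = 0 \\
\operatorname{Cov}(x) &= \mathbb{E}[x^\top x] = U^\top\,\mathbb{E}[z^\top z]\,U = U^\top I_n U = U^\top U = C
\end{aligned}
$$

Because the diagonal of $C$ is all ones, every element of $x$ has unit variance, so the covariance matrix of $x$ is also its correlation matrix, namely $C$.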
which is the same as above!
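The same claim can be checked numerically. The snippet below is a rough sanity check rather than a proof; the 3 x 3 matrix `target` is an arbitrary positive-definite example made up for this check.

```r
set.seed(123)

# An arbitrary valid correlation matrix (illustrative values only).
target <- matrix(c(1.0,  0.5,  0.2,
                   0.5,  1.0, -0.3,
                   0.2, -0.3,  1.0), nrow = 3, byrow = TRUE)

# Upper-triangular factor: t(u) %*% u reproduces target up to rounding.
u <- chol(target)
factor_error <- max(abs(t(u) %*% u - target))

# Transform independent N(0, 1) columns and compare sample correlations.
z <- matrix(rnorm(50000 * 3), ncol = 3)
x <- z %*% u
sample_error <- max(abs(cor(x) - target))
```

`factor_error` should be at the level of floating-point rounding, while `sample_error` is sampling noise that shrinks as the number of rows grows.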
Edit: Fixed LaTeX rendering problem.