It is often a challenge to de-bug code that involves large numbers of long stochastic series – it is very easy to think you have it right and not so easy to make sure. Lately I have needed to generate random correlated series whose means and covariance characteristics I know so I can verify various calculation procedures. I thought I would share a small function I wrote in R that generates the series.

I wanted to be able to provide a correlation matrix together with a set of means and standard deviations and get back a set of series as the columns of a matrix so many rows long.

Basic Procedure

The basic procedure is as follows:

  1. provide the following inputs:
    • The required series length (i.e. number of periods) defaulting to 1,000.
    • The correlation matrix defaulting to a 1 x 1 identity matrix.
    • A vector of means, one for each series, defaulting to 0.
    • A vector of standard deviations, one for each series, defaulting to 1.
  2. Check the inputs make sense or are interpretable in some meaningful way.
  3. Figure out from the dimensions of the correlation matrix how many series are requested.
  4. Build matrices for both the means and standard deviations that matches the series that will be generated (i.e. series length rows x number of series columns.
  5. Create a matrix filled with random numbers ~N(0,1) with series length rows and number of series columns.
  6. Complete a Cholesky decomposition of the correlation matrix and multiply (using matrix multiplication) by the random number matrix.
  7. This matrix, which has the desired correlation characteristics, is then scaled by the desired standard deviations and shifted by the desired means to get the final set of series to be returned.

Code

Here’s the code:

# Ian Rayner: 2011-02-07

# Generates a set (1, or dimension of user-supplied correlation matrix) of
# series of length len.series. If no mean or sd vectors are supplied, they
# are assumed to be 0 and 1 respectively.
# Checks correlation matrix is symmetric and all values are between [-1, +1].
# Checks that the means and sds vectors, if supplied match the dimensions
# of the correlation matrix.

CorrelatedRandomSeries <- function(len.series=1000, cor.matrix=matrix(1,nrow=1), means=0, sds=1){

# Input validation
  if (!identical(cor.matrix, t(cor.matrix)))
    stop(“Correlation matrix should be symmetric”)
  if ((max(cor.matrix) > 1) | (min(cor.matrix) < -1))
    stop(“All values in correlation matrix must be on [-1, +1]”)
  if ((length(means) != dim(cor.matrix)[1]) & (length(means) != 1))
    stop(“Means must match correlation matrix dimensions”)
  if ((length(sds) != dim(cor.matrix)[1]) & (length(sds) != 1))
    stop(“Standard deviations must match correlation matrix dimensions”)

# Infer number of series from correlation matrix.
  num.series <- dim(cor.matrix)[1]

# If single valued, simply repeat for entire series.
  means <- matrix(means, nrow=len.series, ncol=num.series, byrow=T)
  sds <- matrix(sds, nrow=len.series, ncol=num.series, byrow=T)

# Create random series ~N(0,1)
  series <- matrix(rnorm(num.series * len.series, 0, 1), ncol=num.series)

# Use Cholesky decomposition to build correlation then scale and shift by means / sds
  return(sds*(series %*% chol(cor.matrix)) + means)
}

Aside: Cholesky Decomposition

I searched high and low for proof that the Cholesky decomposition of a n x n correlation matrix (ρ) transforms an array of n random vectors (~N(0,1)) into a set of vectors with the correlation coefficients specified in ρ. I couldn’t find one, so here’s my version of the proof:

\boldsymbol{\rho } \textup{ is a correlation matrix with elements } \rho_{i,j}
\textup{ \textbf{L} is a lower triangular matrix with elements L}_{i,j} \textup{, such that } \boldsymbol{\rho } \textup{ = \textbf{LL}}^{T}
\textup{i.e. \textbf{L} is the Cholesky decomposition of } \boldsymbol{\rho }
\begin{pmatrix} \rho_{1,1} & \rho_{1,2} & . & \rho_{1,n}\\ \rho_{2,1} & \rho_{2,2} & . & \rho_{2,n}\\ . & . & . & .\\ \rho_{n,1} & \rho_{n,2} & . & \rho_{n,n} \end{pmatrix} = \begin{pmatrix}L_{1,1} & 0 & . & 0\\ L_{2,1} & L_{2,2} & . & 0\\ . & . & . & .\\ L_{n,1} & L_{n,2} & . & L_{n,n} \end{pmatrix} \begin{pmatrix} L_{1,1} & L_{2,1} & . & L_{n,1}\\ 0 & L_{2,2} & . & L_{n,2}\\ . & . & . & .\\ 0 & 0 & . & L_{n,n} \end{pmatrix}
\textup{ =} \begin{pmatrix} L_{1,1}.L_{1,1} & L_{1,1}.L_{2,1} & . & L_{1,1}.L_{n,1}\\ L_{2,1}.L_{1,1} & L_{2,1}.L_{2,1} + L_{2,2}.L_{2,2} & . & L_{2,1}.L_{n,1} + L_{2,2}.L_{n,2}\\ . & . & . & .\\ L_{n,1}.L_{1,1} & L_{n,1}.L_{2,1} + L_{n,2}.L_{2,2} & . & L_{n,1}.L_{n,1} + L_{n,2}.L_{n,2} + . + L_{n,n}.L_{n,n} \end{pmatrix}
\textup{Summarizing:  } \rho_{i,j} = \rho_{j ,i} = \sum_{k = 1}^{k = i}L_{i,k}L_{j,k} \textup{   } \forall \textup{   } i \leq j
\textbf{Z} \textup{ is a column vector of independent random variables } \sim \textit{N} \left ( 0, 1 \right )
\textup{The ith element of \textbf{Z} is Z}_{i}
\textup{Define \textbf{X} = \textbf{LZ}.  The ith element of  \textbf{X} is X}_{i}
X_{i} = L_{i,1}Z_{1} + L_{i,2}Z_{2} + . + L_{i,i}Z_{i}
X_{j} = L_{j,1}Z_{1} + L_{j,2}Z_{2} + . + L_{j,j}Z_{j}
\textup{The correlation coefficient of } X_{i} \textup{ and } X_{j} \textup{, } i \leq  j \textup{, is}
\rho _{i,j} = X_{i}.X_{j} \textup{ and } Z_{i}.Z_{j} = \delta _{i,j} \textup{ (because Z}_{i} \textup{ are uncorrelated with variance of 1) therefore, }
\rho _{i,j} = L_{i,1}L_{j,1} + L_{i,2}L_{j,2} + . + L_{i,i}L_{i,i} = \sum_{k = 1}^{k = i}L_{i,k}L_{j,k}

which is the same as above!

Edit: Fixed LaTeX rendering problem.

New Commodity Pool Launches

Please provide your name and email address so we can send you our quarterly compilation of new commodity pools registered with NFA.

We hate SPAM and promise to keep your email address safe.

Thank you. Your file will be available after you confirm your subscription. Check your in-box!

Biggest Hedge Funds By $AUM

Please provide your name and email address so we can send you our quarterly compilation of biggest hedge funds by $AUM as reported on the SEC's Form ADV.

We hate SPAM and promise to keep your email address safe.

Thank you. Your file will be available after you confirm your subscription. Check your in-box!

Free Hedge Fund ODD Toolkit

Note this is US Letter size. If you want A4, use the other button!

We hate SPAM and promise to keep your email address safe.

Thank you. Your file will be available after you confirm your subscription. Check your in-box!

Free Hedge Fund ODD Toolkit

Note this is A4 size. If you want US Letter, use the other button!

We hate SPAM and promise to keep your email address safe.

Thank you. Your file will be available after you confirm your subscription. Check your in-box!

Subscribe:

Don't miss our next hedge fund article!

We hate SPAM and promise to keep your email address safe.

Thank you! Check your email for confirmation message.

Share This

Share

If you found this post informative, please share it!