Here’s a useful little utility: an R function that draws random samples from a mixture distribution. I needed a basic version of this for investigating some of Taleb’s ideas, and decided to put together an easy-to-use general version.
R Mixture Distribution Function: Code
    # Mixture distribution
    rMixDist <- function(n=1000, DISTs=c(rnorm, rnorm),
                         pars=list(c("mean"=0, "sd"=1), c("mean"=0, "sd"=2)),
                         probs=c(0.99, 0.01)){
      # Ian Rayner 2016-06
      # Compare with a tolerance: an exact test can fail on floating-point rounding
      if (abs(sum(probs) - 1) > 1e-8) stop("Probabilities should sum to 1")
      if (any(sapply(list(DISTs, pars), length) - length(probs) != 0))
        stop("Length of DISTs, pars and probs must match")
      # Number of draws allocated to each component (zero counts are kept)
      counts <- table(factor(sample(1:length(probs), n, TRUE, prob=probs),
                             levels=1:length(probs)))
      popSamp <- c()
      for (i in 1:length(counts)){
        # Draw counts[i] values from component i using its own parameters
        argList <- as.list(c("n"=unname(counts[i]), pars[[i]]))
        popSamp <- c(popSamp, do.call(DISTs[[i]], argList))
      }
      # Shuffle so the components do not appear in blocks
      return(popSamp[sample(n, n)])
    }
Following R's usual naming convention for random generators (rnorm, runif and so on), the function returns a random sample of size n drawn according to the mixture parameters passed to it.
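To make the sampling scheme concrete, here is a minimal sketch of the same two-stage idea for the default mixture only: first draw a component label for every observation, then draw each value from its component. It is a vectorised shortcut that works because both default components are normal, not a replacement for the general function; the seed and the names comp, sds and theoSd are arbitrary choices for this illustration.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
n     <- 100000
probs <- c(0.99, 0.01)             # default mixing probabilities
sds   <- c(1, 2)                   # default standard deviations (both means are 0)

# Stage 1: pick a component (1 or 2) for each observation
comp <- sample(length(probs), n, replace = TRUE, prob = probs)

# Stage 2: draw each value from its own component (rnorm is vectorised over sd)
x <- rnorm(n, mean = 0, sd = sds[comp])

# With equal means, the mixture variance is the weighted sum of the variances
theoSd <- sqrt(sum(probs * sds^2))
c(empirical = sd(x), theoretical = theoSd)
```

The empirical standard deviation should land close to the theoretical one, which is a quick way to confirm that the mixing weights are being respected.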
Variable | Definition | Default | Notes |
---|---|---|---|
n | Sample size | 1000 | |
DISTs | Random-generation function for each component | rnorm, rnorm (two normal distributions) | Any number. Any type. |
pars | Distribution parameters | mean 0, sd 1 and mean 0, sd 2 | List of vectors of distribution parameters, one vector per component. |
probs | Mixing probability of each component | 99% from the first component, 1% from the second | Must sum to 1. |
If the user enters a set of probabilities that do not add up to one, or a set of distributions, parameters, and probabilities that are not the same in number, the function throws an error.
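One caveat with any sum-to-one check: probabilities are floating-point numbers, so a sum that is 1 on paper may differ from 1 by a rounding error. A tolerance-based comparison is a common way around this. The sketch below uses a hypothetical helper, probsSumToOne, and an assumed tolerance of 1e-8; neither comes from the original function.

```r
# Hypothetical helper (not part of rMixDist): TRUE if probs sum to 1
# within a small tolerance. The default tolerance is an assumption.
probsSumToOne <- function(probs, tol = 1e-8) {
  abs(sum(probs) - 1) < tol
}

# Why exact equality is risky: this is typically FALSE in double arithmetic
0.1 + 0.2 == 0.3

probsSumToOne(c(0.8, 0.17, 0.03))   # valid set of mixing probabilities
probsSumToOne(c(0.5, 0.4))          # sums to 0.9, so it is rejected
```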
R Mixture Distribution Function: Usage Example
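If we want to generate a sample with 100,000 elements that is a mixture of two normal distributions (mean 1, sd 0.1 and mean 2, sd 0.2) and a log-normal distribution (meanlog 0, sdlog 0.3) in a ratio 80:17:3, we would make the following call. The parameter vectors are unnamed here, so they are matched to each generator's arguments by position: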
    rMixDist(100000, DISTs=c(rnorm, rnorm, rlnorm),
             pars=list(c(1, 0.1), c(2, 0.2), c(0, 0.3)),
             probs=c(0.8, 0.17, 0.03))
This results in the following sample:
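The plot is not reproduced here, but a histogram along the following lines should show the same shape: a tall peak near 1, a lower and wider peak near 2, and a thin log-normal component around 1. This is a hedged sketch that draws the same mixture directly with base R so that the block runs on its own; passing the output of the rMixDist call above to hist() should give a comparable picture. The seed, the number of breaks and the plot labels are arbitrary choices.

```r
set.seed(1)                                  # arbitrary seed
n     <- 100000
probs <- c(0.8, 0.17, 0.03)

# Component label for every observation
comp <- sample(3, n, replace = TRUE, prob = probs)

# Fill in each component's draws
x <- numeric(n)
x[comp == 1] <- rnorm(sum(comp == 1), mean = 1, sd = 0.1)
x[comp == 2] <- rnorm(sum(comp == 2), mean = 2, sd = 0.2)
x[comp == 3] <- rlnorm(sum(comp == 3), meanlog = 0, sdlog = 0.3)

# Density-scaled histogram of the mixture sample
hist(x, breaks = 200, freq = FALSE,
     main = "Mixture of N(1, 0.1), N(2, 0.2) and LogN(0, 0.3)", xlab = "x")

# Sanity check: sample mean against the weighted mean of the component means
theoMean <- sum(probs * c(1, 2, exp(0.3^2 / 2)))
c(empirical = mean(x), theoretical = theoMean)
```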
Enjoy!