This is a post in a series on the basics of Monte Carlo simulations. I introduced MC integration using it to estimate π. Then I took a look at the way the convergence characteristics of the estimate depend upon the number of trials.
In my examples I was essentially using MC to do an integration: I have a search space and I sprinkle points over it and add up how many are under the curve. I know the size of the area over which I am sprinkling so I can use the ratio of points under the curve to total points to estimate the area under the curve. In this post, I want to address another aspect of convergence: what is the relationship between the accuracy of the integration and the fraction of total area taken up by the area of interest?
i.e. if the area of interest only occupies 25% of the total area, what difference does that make to the choice of the number of points to use compared to when the area of interest occupies, say 75%?
In the previous posts I estimated pi by integrating the area of a quarter circle contained inside a square. That example is not useful here because I cannot get a quarter circle to occupy 100% of a square! So, this time I am going to estimate the area of a square by integrating a step function of size s <= 1, contained in a unit square (chart on right).
We are estimating the fraction of the unit square occupied by the area under the step function (itself a square of side ‘s’). We are using a Bernoulli process: each point we randomly drop on the unit square falls under the step function (i.e. inside the smaller square) with probability p = s^2. We are estimating p by counting the fraction of points that land under the step function.
If we use the normal approximation to the binomial distribution, then
Let’s express this as a percentage of p: we generally want our accuracy to bear some relationship to the size of the thing we are measuring (in this case, s^2 which equals p).
Thus we can expect that if p is small (i.e. the pink square is small relative to the unit square), our error in estimating s^2 will approach infinity. If p is large (close to 1, the pink square occupies all of the unit square), our error in estimating s^2 approaches zero.
The following charts are the result of running 1000 MC simulations of 1000 trials each and plotting the actual error in the estimate of s^2 and the expected error vs the percent of the unit square occupied by the step function. The results are presented as absolute error and % error:
To pay proper respect to the memory of Jacob Bernoulli, the following two charts show the same thing without using the normal approximation. The jaggedness is because we are working with quantiles:
The conclusion we can draw from this exercise is that if more of the space we are testing is filled by the function we are estimating, then our estimate will be more accurate. In terms of designing a MC simulation, we turn this around and conclude that we can run fewer trials if the function fills the space and still meet our target accuracy.