This will be the final Monte Carlo related post for a while. Good job, really, I can’t keep coming up with titles! I am hoping this one will pull together some of the the themes nicely and show a really cool way of implementing MC integrations using a procedure called “stratification”.
I wanted to show the process by animating at least one of the charts in this post, but after hours of reading about the R “animation” package and trying to figure out how to use SWFTools, I gave up! If any kind soul wants to help me figure out how to animate a series of charts produced in R, don’t hold back! I will put up the R code I used to generate the charts in this post – feel free to download and experiment with it.
In my previous posts, we have seen that
- The absolute error of an integration depends upon the number of trials
- The absolute error depends upon the degree to which the integrand fills the space bounding the simulation
- We have the flexibility to change the size and shape of the space bounding the simulation if it suits us
Stratification makes use of all these ideas.
My objective is to complete an integration to within a desired accuracy. If I knew ahead of time the answer to the integration I would know exactly how many trials to run. One strategy might be to go through a process of successive rounds of simulation gradually increasing the trials until I hit my target. Better yet, I could adopt a strategy that retains the value of each trial and focuses additional trials only where needed.
I want to calculate the value of pi by integrating the area under a quarter circle contained inside a square. I want the answer to be sufficiently accurate that two times the standard error in the estimate of the area is less than 0.1% of the area estimated.
I set the number of trials at 500. Using the knowledge that I am at liberty to change the size and shape of the sampling distribution, I start by dividing the space in half vertically and integrating the areas under the part of the curve contained in each half. I add the two integrals together and sum the squares of the estimated errors in each integral, take the square root and I have an estimate of the error in my total area.
Then I divide the space horizontally and repeat the process. Even though, due to the symmetry of this case, it makes no difference which way I divide the area, because of the stochastic nature of the method one of the answers will have a lower absolute error than the other. I select this answer and discard the other. I now have results for the total area divided into two strata. More generally, the choice will depend on the shape of the curve – it will be driven by the degree to which the integrand “fills” each stratum in the pair. 50% full has the highest absolute error.
If the total absolute error is sufficiently small that my accuracy criterion is met, I stop right there.
If not, I select the stratum with the worst absolute error. I divide it in two, first horizontally and then vertically. I run 500 trials on each of the two pairs of strata. I estimate the absolute errors for each pair and add them up (root sum of the squares). I discard the pair of strata with the worst error and retain the other pair. By loading 500 trials into a smaller area (half as big as before) I have doubled the density of my trials – I am getting a level of accuracy as if I had started out using 1000 trials. Adding the areas and error from these two strata and the undivided stratum from before I now have an updated estimate of the total area and total absolute error.
If the total absolute error is sufficiently small that my accuracy criterion is met, I stop right there. Otherwise I repeat the process again and again until I have achieved my target error!
The three charts show the process completed different ways for comparison:
- 500 trials 0.1% accuracy
- 1000 trials 0.1% accuracy
- 500 trials 0.01% accuracy
First just note the way the area is divided up into rectangles: the darker the rectangle the larger the number of trials (each dot is a trial, every fourth trial is plotted to compensate for the repetition). The procedure has resulted in all the work being done in the area surrounding the curve – the procedure wasted very little effort on the strata above and below the curve (the big empty-looking spaces). Why? Because the absolute error when a stratum is below the curve (100% full) or above the curve (100% empty) approaches zero.

Second, note what happens when 1000 trials is used instead of 500: there is less stratification, the strata look generally bigger. The procedure achieves the required level of accuracy without having to sub-divide (stratify) the space to the same degree. The trade-off is that many more trials are “wasted” on areas away from the vicinity of the curve (where the information is).

Finally, note what happens when I demand greater accuracy of my 500 trial simulations: the area of interest is stratified to a much greater degree. I never thought to measure this effect, but I would guess that the stratification goes 4 times deeper to get half the error.

I wish I could figure out the animation because it is very cool to watch the algorithm work!
I explored the trade-offs between trials, accuracy and time by measuring the average time to complete an integration using 100 – 1000 trials and 0.1% to 0.01% accuracy. I did not find it particularly informative. I think the curve you are trying to integrate might have a big effect. I expect there to be an optimum trial size resulting from the excessive stratification when “trials” is too small, vs wasted effort when “trials” is too large. I am not sure I really saw that, though about 500 trials seemed to go much quicker than 100 and a little quicker than 1000. More work!
Following is the R code. The one little wrinkle I introduced was to set the error to 100% if ever less than 5% or more than 95% of the points in a simulation fall under the curve. With so few points sitting on one side of the curve, it is not really appropriate to use sqrt(p.(1-p)/trials) to estimate the error in p. Setting the error to one in this case simply forces that stratum to be split further as it will automatically be the worst existing stratum.
Have fun with MC. That’s all I have planned to do on it, it’s time for some different topics!
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Ian Rayner July 2011 # # Estimating pi using MC simulation - uniform dist'n / stratified sampling # # equation of circle of radius r centered on origin is # x^2 + y^2 = r^2 # So x ~ U(x.o, x.o + x.len) and y ~ U(y.o, y.o + y.len), if x^2 + y^2 > r^2 then # point is outside circle. # # ratio of points inside circle to total points = pi/4 #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Packages / Functions # completes integration and plots points # Note the override on <5% and >95% results (not reliable) # forces these areas to be re-visited mc.est <- function(radius=1, x.o=0, x.len=1, y.o=0, y.len=1, trials=1000, plot.on=T){ x.loc <- x.o + runif(trials, 0, x.len) y.loc &lt- y.o + runif(trials, 0, y.len)< area &lt- x.len * y.len frac &lt- sum((x.loc^2 + y.loc^2) &lt radius^2) / trials err &lt- sqrt(frac * (1 - frac) / trials) if ((x.o^2 + y.o^2 &lt radius^2) & (frac &lt 0.05)) err &lt- 1 if (((x.o + x.len)^2 + (y.o + y.len)^2 &gt radius^2) & (frac &gt 0.95)) err &lt- 1 if (plot.on){ points(x.loc[1:floor(trials/4)], y.loc[1:floor(trials/4)], pch=".") } return(area * c(frac, err)) } # Divides up the space horiz / vert and runs 4 MC integrations # Selects best results and returns them as 2 rows of a matrix stratify &lt- function(radius=1, x.o=0, x.len=1, y.o=0, y.len=1, trials=1000, plot.on=T){ x.o &lt- c(x.o, x.o + x.len / 2, x.o, x.o) x.len &lt- c(rep(x.len / 2, 2), rep(x.len, 2)) y.o &lt- c(rep(y.o, 3), y.o + y.len / 2) y.len &lt- c(rep(y.len, 2), rep(y.len / 2, 2)) strata &lt- matrix(nrow=4, ncol=2) for (i in 1:4){ strata[i, ] &lt- mc.est(radius, x.o[i], x.len[i], y.o[i], y.len[i], trials, plot.on) } index &lt- ifelse((strata[1, 2]^2 + strata[2, 2]^2) &lt (strata[3, 2]^2 + strata[4, 2]^2), 1, 2) index1 &lt- 2 * index - 1 index2 &lt- 2 * index first &lt- c(x.o[index1], x.len[index1], y.o[index1], y.len[index1], strata[index1, 1], strata[index1, 2]) second &lt- c(x.o[index2], x.len[index2], y.o[index2], y.len[index2], strata[index2, 1], strata[index2, 2]) return(rbind(first, second)) } # the script proper ... tim &lt- proc.time()[3] # user settings trials &lt- 500 # number of trials to use in each MC integration error &lt- 0.0001 # required accuracy of integration devs &lt- 2 # converts error to z-score radius &lt- 1 # of the circle x.o &lt- 0 # sets the initial integration space x.len &lt- sqrt(pi/2) y.o &lt- 0 y.len &lt- sqrt(pi/2) plot.on &lt- T # turn on plot sig.figs &lt- 5 # for output results # set up plot space if (plot.on){ dev.new() plot(x.o, y.o, pch=".", xlim=c(x.o, x.len), ylim=c(x.o, y.len), main="Scatter Plot of MC Samples", sub = paste(trials, " Trials, ", 100 * error, "% err @ ", devs, " sd", sep="")) } result.matrix &lt- matrix(nrow=1, ncol=6) result.matrix[1, ] &lt- c(x.o, x.len, y.o, y.len, mc.est(radius, x.o, x.len, y.o, y.len, trials, plot.on)) # loop until required accuracy is exceeded while (devs * sqrt(sum(result.matrix[ , 6]^2)) / sum(result.matrix[ , 5]) &gt error){ stratum &lt- which.max(result.matrix[ , 6]) result.matrix &lt- rbind(result.matrix, stratify(radius, x.o=result.matrix[stratum, 1], x.len=result.matrix[stratum, 2],y.o=result.matrix[stratum, 3],y.len=result.matrix[stratum, 4], trials, plot.on)) result.matrix &lt- result.matrix[-stratum, ] } result &lt- c(sum(result.matrix[,5]), sqrt(sum(result.matrix[,6]^2))) if (plot.on){ x &lt- seq(0,1,by=0.001) y &lt- sqrt(1 - x^2) lines(x, y, col="red") } print(paste("Estimate = ", round(4 * result[1], sig.figs), " ... Actual = ", round(pi, sig.figs), sep="")) print(paste("tgt ", devs, " sd error = ", 100 * error, "% ... est. error = ", 100 * round(devs * result[2] / result[1], sig.figs), "% ... actual = ", 100 * round((4 * result[1] - pi) / result[1] / 4, sig.figs), "%", sep="")) print(paste("Interval: ", round(4 * (result[1] - devs * result[2]), sig.figs), " to ", round(4 * (result[1] + devs * result[2]), sig.figs))) print(proc.time()[3] - tim)
Share
If you found this post informative, please share it!