I completed the following 5-Fold Cross-Validation study to illustrate my previous post: Cross-Validation.

## Base Case

Using TradingBlox Builder, I took a simple dual moving average system and held the fast moving average at 10 days. I disabled stops, so the only variables were the position size and the slow moving average. A unit of position size is based on the 39-day Average True Range (ATR). The system is always in the market. Cash earns 0% interest, and I used 8% slippage on entry, exit and both sides of the roll transactions. I require a minimum 30-day average volume of 1000 contracts, and no position can exceed 25% of that volume.
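
For readers who have not seen a dual moving average system, here is a minimal Python sketch of the always-in-the-market signal logic. It is not the TradingBlox implementation: the use of simple (rather than exponential) moving averages, the zero position during the warm-up period, and the function names `sma` and `dual_ma_positions` are all assumptions made for illustration.

```python
def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the price leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def dual_ma_positions(prices, fast=10, slow=210):
    """+1 (long) when the fast MA is above the slow MA, otherwise -1 (short).

    Assumption: the position is 0 until both averages are defined, and a tie
    is treated as short. The real system's tie and warm-up rules may differ.
    """
    positions = []
    for f, s in zip(sma(prices, fast), sma(prices, slow)):
        if f is None or s is None:
            positions.append(0)
        else:
            positions.append(1 if f > s else -1)
    return positions
```

Because there are no stops, the position only ever flips between long and short once both averages exist, which is what "always in the market" means here.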

I sought to optimize the position size and the slow moving average using “robust open MAR” as my objective function. I define robust open MAR, rOpenMAR, as the regressed annual return (RAR) divided by the Maximum Total / Open Equity Drawdown (TEDD) during the simulation, so a higher value is better.
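
As a quick check on the objective function, the Fold 1 figures reported in the results table further down (training RAR 17.85%, TEDD 31.50%, rOpenMAR 0.567) are consistent with rOpenMAR being return over drawdown. A minimal sketch follows; the function name `r_open_mar` is mine, and both inputs are assumed to be in the same percentage units.

```python
def r_open_mar(rar_pct, tedd_pct):
    """Return-to-drawdown ratio: regressed annual return over max drawdown.

    Both arguments are percentages (e.g. 17.85 for 17.85%).
    """
    if tedd_pct <= 0:
        raise ValueError("drawdown must be positive")
    return rar_pct / tedd_pct


# Fold 1 figures from the results table
train = round(r_open_mar(17.85, 31.50), 3)      # 0.567
validate = round(r_open_mar(1.64, 76.50), 3)    # 0.021
```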

I created 5 pairs of portfolios: Fold1Train, Fold1Validate, etc. The portfolios were constructed by using R’s “sample(1:39)” to randomly sort 39 integers. Fold1Validate was created by taking the first 8 integers in the random series and picking the corresponding futures from my regular 39-future portfolio, sorted alphabetically. The remaining 31 futures were used to create Fold1Train. The next 8 integers were used to construct Fold2Validate, with the other 31 futures forming Fold2Train, and so on. The last validation portfolio, Fold5Validate, contained only 7 futures while Fold5Train contained 32.
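
The fold construction can be sketched in Python as follows. I used R's `sample(1:39)` for the actual study; this version uses Python's `random.shuffle` instead, so it will not reproduce my particular permutation, and the function name `make_folds` and the `seed` parameter are illustrative assumptions.

```python
import random


def make_folds(n_items=39, fold_size=8, seed=None):
    """Split positions 1..n_items into (train, validate) pairs.

    Each integer is a position in the alphabetically sorted portfolio.
    Consecutive chunks of the shuffled order become validation folds;
    everything outside the chunk is that fold's training set.
    """
    rng = random.Random(seed)
    order = list(range(1, n_items + 1))
    rng.shuffle(order)

    folds = []
    for start in range(0, n_items, fold_size):
        validate = sorted(order[start:start + fold_size])
        train = sorted(set(order) - set(validate))
        folds.append((train, validate))
    return folds


folds = make_folds(seed=1)
validate_sizes = [len(v) for _, v in folds]   # four folds of 8, one of 7
train_sizes = [len(t) for t, _ in folds]      # four of 31, one of 32
```

Every future lands in exactly one validation fold, so each instrument is used for out-of-sample testing once.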

For each of the FoldnTrain portfolios in turn, I searched the slow moving average from 60 days to 250 days in increments of 10 days, and the position size from 0.01% of equity to 0.2% of equity in steps of 0.01%. I selected the parameter set with the highest rOpenMAR value, recording that value along with the RAR and TEDD. A typical result looked as follows:
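
As a rough illustration of the search procedure itself (not of the result chart), the exhaustive grid can be sketched in Python as below. In the study the objective was evaluated by TradingBlox; here `objective` is a hypothetical stand-in for "run the backtest on the training fold and return rOpenMAR", and `toy_objective` is a made-up smooth function used only so the example runs.

```python
def parameter_grid():
    """All (slow MA days, risk % of equity) pairs covered by the search."""
    slow_mas = range(60, 251, 10)              # 60, 70, ..., 250 days
    risks = [i / 100 for i in range(1, 21)]    # 0.01, 0.02, ..., 0.20 percent
    return [(slow, risk) for slow in slow_mas for risk in risks]


def best_parameters(objective, grid):
    """Return the grid point with the highest objective value."""
    return max(grid, key=lambda params: objective(*params))


def toy_objective(slow, risk):
    """Hypothetical placeholder with a single peak at (210, 0.16)."""
    return -((slow - 210) / 100) ** 2 - (risk - 0.16) ** 2


grid = parameter_grid()
best = best_parameters(toy_objective, grid)
```

The grid has 20 slow moving average values and 20 position sizes, so each training fold requires 400 backtests. Note that `max` returns only the global maximum, which matters for the discussion of Fold 4 further down.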

## 5-Fold Cross-Validation

I then ran the optimal parameters against the corresponding validation fold, FoldnValidate, with position size scaled up by a factor of 4 (see below). I recorded the rOpenMAR, RAR and TEDD. A summary of the results for all five folds is as follows:

Objective Function: rOpenMAR

| Fold | LMA | Risk% | Train rOpenMAR | Train RAR | Train TEDD | Validate rOpenMAR | Validate RAR | Validate TEDD |
|---|---|---|---|---|---|---|---|---|
| 1 | 210 | 0.16% | 0.567 | 17.85% | 31.50% | 0.021 | 1.64% | 76.50% |
| 2 | 210 | 0.12% | 0.420 | 11.59% | 27.60% | 0.318 | 9.89% | 31.10% |
| 3 | 200 | 0.13% | 0.375 | 12.77% | 34.10% | 0.232 | 8.90% | 38.30% |
| 4 | 130 | 0.13% | 0.424 | 12.35% | 29.10% | 0.309 | 11.39% | 40.90% |
| 5 | 210 | 0.12% | 0.446 | 13.51% | 30.30% | 0.206 | 6.62% | 32.10% |
| Averages | | | 0.446 | 13.61% | 30.52% | 0.217 | 7.69% | 43.78% |

A number of important issues arose during this process:

- A possible stratification strategy might involve dividing the total portfolio into groups with low correlation to each other, preserving some of the diversification present in a larger portfolio, e.g. put the currency futures in one group and randomly pick one currency plus 7 other futures to make each validation fold.
- Since the validation portfolio is approximately 1/4 the size of the training portfolio, it is clear that position size must be adjusted somehow. The obvious starting point is to use a factor of 4. A more sophisticated solution would involve scaling by the volatility of the portfolio.
- The chart above shows two obvious local maxima for rOpenMAR: one in the 210 DMA area and one in the 130 DMA area. Inspection of the table above shows that Fold 4 found its global maximum in the 130 DMA region while all other folds were in the 210 DMA region. In the case of Fold 4, one could argue (using an “apples to apples” comparison) that the local maximum in the 210 DMA region should have been selected (see image below) instead of the global maximum. This would have favorably impacted the validation dataset results.
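
To make the position-size adjustment concrete, here is a small Python sketch. The flat factor of 4 is what I actually used; computing the factor from the ratio of portfolio sizes is only one possible refinement and is shown as an assumption, as are the function names.

```python
def size_ratio(n_train, n_validate):
    """Ratio of training to validation portfolio size (number of futures)."""
    return n_train / n_validate


def scaled_risk(risk_pct, factor=4.0):
    """Scale the per-unit risk (% of equity) for the smaller portfolio."""
    return risk_pct * factor


ratio_folds_1_to_4 = size_ratio(31, 8)   # 3.875
ratio_fold_5 = size_ratio(32, 7)         # about 4.57
fold1_validate_risk = scaled_risk(0.16)  # 0.16% optimum scaled to 0.64%
```

The exact ratios bracket the flat factor of 4, which is why it is a reasonable starting point. Neither approach accounts for the lower diversification of a 7 or 8 contract portfolio, which is what scaling by portfolio volatility would address.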

Under the experimental conditions presented, the RAR has dropped by about 40-50% while the TEDD has increased 40-50%. This is not terrible, but not particularly good. Given that this is a long-term trend following strategy working on only 7-8 contracts in the validation folds, it is about as good as can be expected.
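
The size of the degradation can be checked directly from the per-fold figures in the results table. The sketch below is plain Python using only the standard library; the helper name `pct_change` is mine.

```python
from statistics import mean

# Per-fold results copied from the table (percent)
train_rar = [17.85, 11.59, 12.77, 12.35, 13.51]
validate_rar = [1.64, 9.89, 8.90, 11.39, 6.62]
train_tedd = [31.50, 27.60, 34.10, 29.10, 30.30]
validate_tedd = [76.50, 31.10, 38.30, 40.90, 32.10]


def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


rar_change = pct_change(mean(train_rar), mean(validate_rar))
tedd_change = pct_change(mean(train_tedd), mean(validate_tedd))
```

On these numbers the average RAR falls by roughly 43 to 44% and the average TEDD rises by about the same proportion. Fold 1 contributes heavily to both averages, with a validation drawdown of 76.50%.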

At some point I will explore methods for partitioning the price series across time as well as by instrument. In the meantime, hopefully, this post demonstrates the basics of how to implement a K-Fold validation.

2010-12-14: Edited to add final image and improve accuracy of associated note, added info on position size in 2nd paragraph.
