This is the seventh in my Hedge Fund Hacks series. It is a natural follow-up to my sixth hack on Hedge Fund Return Predictability in which I identified the following conundrum:

- You need a track record of 8+ years of monthly data to have reasonable confidence in a manager’s expected returns.
- The longer the track record you demand, the fewer managers you will have to choose from.
- A long track record will not be a stationary series: Both manager and markets will have evolved!

In this post I explore whether we can substitute more managers for longer track records and so solve the conundrum.

## The Data

I work with data from the Eurekahedge Database, my go-to source for hedge fund data. First I clean the data using my data hygiene strategy. Then I select only managers with at least a 20-year track record and a Sharpe Ratio below 2. This leaves 212 programs. Each track record is normalized to 20% annual volatility to make them comparable on a risk-adjusted basis.
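The normalization step can be sketched as follows. This is a minimal illustration rather than the code behind the charts: it assumes the rescaling is a simple multiplication of each monthly return (ignoring financing costs), that volatility is annualized by the square root of 12, and the function name and sample returns are hypothetical.

```python
import math
import statistics

TARGET_ANNUAL_VOL = 0.20  # the 20% annualized target used in the post


def normalize_to_target_vol(monthly_returns, target_annual_vol=TARGET_ANNUAL_VOL):
    """Rescale monthly returns so their annualized volatility equals the target."""
    monthly_vol = statistics.stdev(monthly_returns)  # sample std dev of monthly returns
    annual_vol = monthly_vol * math.sqrt(12)         # assumption: square-root-of-time annualization
    scale = target_annual_vol / annual_vol
    return [r * scale for r in monthly_returns]


# Hypothetical monthly returns, for illustration only
returns = [0.012, -0.004, 0.021, 0.003, -0.015, 0.009, 0.017, -0.008]
scaled = normalize_to_target_vol(returns)
scaled_annual_vol = statistics.stdev(scaled) * math.sqrt(12)
print(f"annualized volatility after scaling: {scaled_annual_vol:.1%}")
```

Because every return is multiplied by the same positive factor, the ratio of mean return to volatility is unchanged; only the scale of the series moves.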

The charts to the right give a sense of the entire population. They show the distribution of monthly returns and the Sharpe Ratios of all 212 programs.

## Longer Track Records Give Better Performance Estimates

First let’s revisit the results from the previous post.

I assume that a 20+ year track record allows us to make a reliable estimate of the manager’s true ability. Imagine we have only a small sample from that track record for estimating future returns. I explore how good or bad an estimate we get depending upon the size of the sample. The chart below shows the results for four different sample sizes (each double the previous sample size, starting with 12 months).

##### Detailed Methodology - Track Record Experiment

For all 212 programs, for multiple sample sizes (12, 24, 48, and 96 months), I complete the following process:

- Take 10,000 random samples of a given size from the 20+ year track record.
- For each sample, say 24 months, I estimate the program’s return and volatility.
- I calculate the return I would have earned over the 20+ years had I sized my allocation to this program, based on the sample, to target 20% annualized volatility.
- I calculate the shortfall or excess return vs. my expected value.

The results are plotted in the histograms above.
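The steps above can be sketched roughly as follows. This is my own minimal reconstruction, not the original code: the track record is a synthetic series of Gaussian monthly returns, sampling is assumed to be with replacement, and the names `surprises` and `spread` are hypothetical.

```python
import math
import random
import statistics

TARGET_MONTHLY_VOL = 0.20 / math.sqrt(12)  # 20% annualized, expressed per month


def surprises(track_record, sample_size, n_samples, rng):
    """Actual minus expected monthly return for repeated random samples."""
    long_run_mean = statistics.mean(track_record)
    out = []
    for _ in range(n_samples):
        # Assumption: months are drawn at random with replacement
        sample = rng.choices(track_record, k=sample_size)
        # Allocation is sized from the SAMPLE volatility, so a low sample
        # volatility leads to over-allocation
        leverage = TARGET_MONTHLY_VOL / statistics.stdev(sample)
        expected = leverage * statistics.mean(sample)  # what the sample promises
        actual = leverage * long_run_mean              # what the full record delivers
        out.append(actual - expected)
    return out


rng = random.Random(0)
# Synthetic 240-month track record, for illustration only
track_record = [rng.gauss(0.0125, 0.0577) for _ in range(240)]

spread = {}
for size in (12, 24, 48, 96):
    spread[size] = statistics.pstdev(surprises(track_record, size, 2000, rng))
    print(f"{size:>2} months: std dev of surprise = {spread[size]:.2%}")
```

On synthetic data like this the spread of the surprise narrows as the sample grows, which is the pattern the histograms show for the real programs.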

This approach is somewhat different from what I did in my prior post. I believe I am now more accurately testing what would happen in the real world.

The difference is in the way sample volatility is handled. In this case, if the sample volatility comes in low relative to the population, I will end up over-allocating to the program. This means that if the sample average return (i.e. what I expect to get in the future) is high relative to the population mean (what I actually get), then I will get a larger negative surprise than in the previous experiment.

The impact is small, though the distributions of the errors are much more symmetrical vs. the somewhat left-skewed distributions I got in Hedge Fund Hack # 6.

### Discussion

The errors in the estimates of monthly returns are shockingly large! The average return across all the months in all the programs was 1.25%. So in all cases, the average negative surprise is similar to the value we are trying to estimate. If you had relied on a 12 month sample as your estimate of expected returns, 1 in 5 managers would have delivered long-term performance 1.5% or worse per month below your expectations. Even with a 96 month sample, 1 in 5 managers under-perform vs. expectations by 0.5% or worse per month!

You need a large sample (i.e. a long track record) before you can estimate future monthly returns with any confidence. To a first approximation, the error in the estimate shrinks with the square root of the sample size. This should not come as a surprise: the variance of a sample mean falls in inverse proportion to the sample size, so its standard deviation falls with the square root.
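To put rough numbers on the square-root rule, here is a small calculation of the textbook standard error of a mean monthly return at 20% annualized volatility. It assumes independent, identically distributed monthly returns, which real track records will not satisfy exactly, so treat it as a yardstick rather than a replication of the experiment.

```python
import math

monthly_vol = 0.20 / math.sqrt(12)  # 20% annualized volatility, per month

# Standard error of the sample mean: sigma / sqrt(n), assuming i.i.d. months
standard_error = {n: monthly_vol / math.sqrt(n) for n in (12, 24, 48, 96)}

for n, se in standard_error.items():
    print(f"{n:>2} months: standard error of mean monthly return = {se:.2%}")
```

This gives roughly 1.67%, 1.18%, 0.83% and 0.59% per month for 12, 24, 48 and 96 months: quadrupling the sample halves the error.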

## Shorter Track Records – More Managers to Choose From

Part of the conundrum outlined in the introduction is that if you want long track records then you have fewer managers to choose from. We can illustrate this effect by looking at the lengths of the track records of the managers reporting in the Eurekahedge database as of December 2017. I have not applied any filtering to this set other than to use unique programs only.

The adjacent chart illustrates the number of programs with a track record greater than or equal to the value on the x-axis. I have added a number of points to match up with the 12, 24, 48, and 96 month track records used in the previous charts. I have also included the round numbers 5, 10 and 20 years.

Going from a 2 year track record to 5 years, we lose about a third of the programs. We lose another half going from 5 years to 10. So the effect is quite large.
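The counting behind a chart like this is simple. A minimal sketch, using a hypothetical list of track record lengths in months rather than the actual database:

```python
def programs_with_at_least(track_lengths_months, thresholds):
    """Count programs whose track record is at least each threshold (in months)."""
    return {t: sum(1 for n in track_lengths_months if n >= t) for t in thresholds}


# Hypothetical track record lengths in months, for illustration only
lengths = [8, 14, 30, 30, 61, 75, 120, 130, 250]
counts = programs_with_at_least(lengths, [12, 24, 48, 60, 96, 120, 240])

for months, count in counts.items():
    print(f">= {months:>3} months: {count} programs")
```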

## Multiple Managers vs A Single Manager: More Predictable Returns

My preferred measure of predictability in this experiment is the average negative surprise. The definition sounds like something Yogi Berra came up with: it is the average difference between what we expect and what we get, given that what we get is less than what we expect!
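In code, the definition might look like the sketch below. The function name, the choice to return 0.0 when there are no shortfalls, and the example numbers are my own assumptions for illustration.

```python
def average_negative_surprise(expected, actual):
    """Mean of (actual - expected), taken only over cases where actual < expected."""
    shortfalls = [a - e for e, a in zip(expected, actual) if a < e]
    if not shortfalls:
        return 0.0  # assumption: report zero when nothing fell short
    return sum(shortfalls) / len(shortfalls)


# Hypothetical expected vs. realized monthly returns
expected = [0.020, 0.010, 0.015, 0.005]
actual = [0.010, 0.012, 0.005, 0.005]

result = average_negative_surprise(expected, actual)
print(f"average negative surprise: {result:.2%}")
```

Only the first and third cases fall short (by one percentage point each), so the result here is -1.00%. The sign convention matches the negative figures quoted in this post.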

You can see from the four-panel chart above that by going from a 24 month sample to a 96 month sample, we improved the predictability of returns. The average negative surprise went from -1% to -0.5%: we reduced the average shortfall by half a percentage point per month.

We are going to use this as our benchmark: How many managers do we need in a portfolio to get a similar improvement while using only a 24 month sample?

I follow more or less the same approach as before, working with managers with at least 240 month track records. However, in this case, I create random portfolios of managers. Based on a sample of 24 months, I calculate the expected return of the portfolio. Finally, I see how well the expected return predicts the actual return. I explore the effect of varying the number of managers in the portfolio.

Here are the results:

##### Detailed Methodology - Portfolio Size Experiment

For all 212 programs, for multiple portfolio sizes (1, 2, 4, and 8 programs), I complete the following process:

- Create 1,000 portfolios by randomly picking from 212 programs without replacement.
- Select the range of months for which all members of the portfolio have returns (this results in a minimum 210 consecutive months from which to sample in the next step).
- 1,000 times for each portfolio, pick 24 months at random with replacement.
- Figure the sample returns and volatilities for each portfolio member.
- Weight the portfolio members so they each contribute equally to 20% portfolio volatility.
- Use the weights to calculate the expected portfolio return.
- Figure the “actual” return of the portfolio using the weighted long-run return of each program.
- Calculate the error in the estimated return by subtraction.

I experimented with using various combinations of numbers of portfolios and numbers of trials and found the results to be consistent. Since the number of possible portfolios is orders of magnitude larger than the numbers of possible samples, one could certainly argue for more portfolios and fewer samples from each.
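Here is a rough sketch of the procedure on synthetic data. Several things in it are my assumptions rather than the original implementation: the programs are simulated as a shared factor plus idiosyncratic noise, all programs cover the same months (so the common-date-range step is skipped), and "contribute equally" is read as levering each member to the 20% target from its sample volatility and then giving it a 1/N weight. All function names are hypothetical.

```python
import math
import random
import statistics

TARGET_MONTHLY_VOL = 0.20 / math.sqrt(12)  # 20% annualized, per month


def make_synthetic_programs(n_programs, n_months, rng):
    """Hypothetical stand-in for the database: shared factor plus own noise."""
    common = [rng.gauss(0.0, 0.03) for _ in range(n_months)]
    return [[0.01 + c + rng.gauss(0.0, 0.04) for c in common]
            for _ in range(n_programs)]


def portfolio_errors(programs, size, n_portfolios, n_trials, rng, sample_len=24):
    """Actual minus expected portfolio return over many portfolios and samples."""
    n_months = len(programs[0])
    long_run = [statistics.mean(p) for p in programs]
    errors = []
    for _ in range(n_portfolios):
        members = rng.sample(range(len(programs)), size)  # without replacement
        for _ in range(n_trials):
            # The same months for every member, drawn with replacement
            months = [rng.randrange(n_months) for _ in range(sample_len)]
            samples = [[programs[m][t] for t in months] for m in members]
            # Assumption: lever each member to target vol, then weight 1/N
            weights = [TARGET_MONTHLY_VOL / statistics.stdev(s) / size
                       for s in samples]
            expected = sum(w * statistics.mean(s) for w, s in zip(weights, samples))
            actual = sum(w * long_run[m] for w, m in zip(weights, members))
            errors.append(actual - expected)
    return errors


rng = random.Random(1)
programs = make_synthetic_programs(40, 240, rng)
spread = {size: statistics.pstdev(portfolio_errors(programs, size, 50, 40, rng))
          for size in (1, 2, 4, 8)}
for size, s in spread.items():
    print(f"{size} program(s): std dev of error = {s:.2%}")
```

With real data the date alignment step matters, since the programs start at different times. The split between number of portfolios and number of trials per portfolio is a parameter here, so the trade-off mentioned above is easy to experiment with.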

### Results and Discussion

A portfolio of 8 managers, each with 24 months of data, results in a prediction of future performance about as reliable as having 1 manager and 96 months (or 8 years) of data. Put another way, I had to use 8 times as many managers to get the effect of 4 times the track record! Or, yet another way, to halve the error took 8 times the managers.

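To a first approximation, it appears the reliability improves with the cube root of the number of managers. Let’s do a quick back of the envelope calculation. Say you go from requiring a 60 month track record to 36 months. Since the error scales with the inverse square root of the track record length, the shorter record raises the estimation error by a factor of (60 / 36)^(1/2) ≈ 1.29. To offset that with a cube root improvement, you will need about 1.29^3 = (60 / 36)^(3/2) ≈ 2.2 times as many managers as you currently have. (The same rule gives 4^(3/2) = 8 times the managers for 4 times the track record, matching the result above.) That will give you about the same confidence level in your expected returns.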

My gut was expecting I should get the same effect with 4 times as many managers. Apparently, my gut under-estimated the effect of correlation. In the sample data, managers tend to be synchronized in their over / under performance vs. their long-run “true” performance!

You should be asking yourself if this effect will be even worse if you are selecting managers from the same strategy group such as long-short equity.

The implication, based on this data at least, is that as you reduce the minimum acceptable track record, you have to add a lot of managers to maintain the same level of confidence in your expected return. There are, of course, significant diversification benefits for your portfolio.

## Effect of Correlation

I thought it instructive to break the correlation effect I hypothesized above to see how large it was. I repeated the experiment with a small but important difference. Instead of using the same randomly sampled months for all the members of the portfolio, I used a different random set of months for each member. So, if there was a correlation effect, it would be removed.
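The only change is in how the months are drawn. The sketch below isolates that difference on synthetic data. It is a simplified illustration under my own assumptions: equal weights, no volatility scaling, and simulated programs that share a common factor. The function name and parameters are hypothetical.

```python
import random
import statistics


def mean_estimate_errors(programs, n_trials, rng, shared_months, sample_len=24):
    """Error in the equally weighted mean-return estimate across the programs."""
    n_months = len(programs[0])
    true_mean = statistics.mean(statistics.mean(p) for p in programs)
    errors = []
    for _ in range(n_trials):
        shared = [rng.randrange(n_months) for _ in range(sample_len)]
        estimates = []
        for p in programs:
            # Shared: every member sees the same months (correlation preserved).
            # Independent: each member draws its own months (correlation broken).
            months = shared if shared_months else [
                rng.randrange(n_months) for _ in range(sample_len)]
            estimates.append(statistics.mean(p[t] for t in months))
        errors.append(true_mean - statistics.mean(estimates))
    return errors


rng = random.Random(7)
# Synthetic data: 8 programs driven by one common factor plus their own noise
common = [rng.gauss(0.0, 0.04) for _ in range(240)]
programs = [[0.01 + c + rng.gauss(0.0, 0.03) for c in common] for _ in range(8)]

spread_shared = statistics.pstdev(
    mean_estimate_errors(programs, 3000, rng, shared_months=True))
spread_independent = statistics.pstdev(
    mean_estimate_errors(programs, 3000, rng, shared_months=False))
print(f"same months for all members:   {spread_shared:.2%}")
print(f"separate months per member:    {spread_independent:.2%}")
```

When each member gets its own months, the members' estimation errors no longer move together, so they average out faster.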

The results were as follows:

### Results and Discussion

Amazingly enough, we now only need to go to 4 programs to get the target improvement in confidence, rather than the 8 we needed when correlation was allowed to work its effect. Without any correlation, the error decreases with the square root of the number of managers.
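One way to see why correlation slows the improvement is a standard textbook formula. The simplifying assumptions are mine, not results from the data: each of the $N$ managers' estimation errors has the same variance $\sigma_e^2$, every pair of errors has the same correlation $\rho$, and the portfolio is equally weighted. The variance of the average error $\bar{e}$ is then:

$$
\operatorname{Var}(\bar{e}) = \sigma_e^2 \left( \frac{1}{N} + \frac{N-1}{N}\,\rho \right)
$$

With $\rho = 0$ this reduces to $\sigma_e^2 / N$, so the error falls with $\sqrt{N}$, as in the decorrelated experiment. With $\rho > 0$ the second term does not shrink as managers are added, and the error can never fall below $\sigma_e \sqrt{\rho}$. The cube root relationship reported earlier is an empirical fit over 1 to 8 managers, not something this formula implies.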

This experiment gives you an idea of the “cost” of correlation. It is in the extra managers we need to hire to get more reliable performance predictions. By the way, here’s a great post on non-stationarity of correlation.

## Conclusions

I am not directly addressing diversification in this post. Rather, I am exploring the confidence we have in performance statistics. I have used return as an example, but the concept applies equally to volatility, correlation, etc. We need to think through this issue BEFORE we get to diversification. We can’t optimize a portfolio until we have confidence in the performance characteristics of its components. Diversification is the ultimate objective.

There is only one way to improve confidence – more data. When analyzing managers for a hedge fund portfolio we have two ways of getting more data:

- Use longer track records.
- Group multiple managers to represent a particular strategy.

Longer track records increase the confidence with the square root of their length – 4 times the track record halves the error. The downsides are having to figure out how much of the track record is currently meaningful and having fewer managers to choose from.

Based on the limited set of data I used, grouping sets of managers together appears to increase confidence with the cube root of the number of managers – 8 times the managers halves the error. Allowing shorter track records gives greater choice of managers and a greater likelihood their track records are still relevant. The downside is having a lot more due diligence per dollar allocated.

When you build your portfolio, the components become sets of managers grouped by strategy rather than individual managers. This brings the collateral benefit of enhancing the portfolio’s diversification.

If you don’t have sufficient capital to allocate to a lot of managers, consider allocating to a Fund of Funds. While there may be extra costs (debatable), the benefits of lower volatility, lower drawdown, and greater confidence in future returns may be worth it.

Photo credit: Photo by Natalie Rhea Riggs on Unsplash
