Cross-correlation Matrix: CROSSCOR

There are many possible applications for this handy little “shortcut” tool.  Let us just illustrate the application that caused us to create it.  We have used it to choose assets for sector rotation, but it can also be effectively used to choose among proprietary funds.

We have found diversification to be a major help in minimizing the volatility of investment returns. We diversify not only assets and asset classes, but also investment strategies. One of those strategies is sector rotation of Exchange Traded Funds (ETFs), which we have found can generate compound annual rates of return that would be the envy of Wall Street.

Before you can choose the sector (or ETF) in which to invest, you have to select your population of sectors to consider.  Your first cut should be made on the basis of liquidity, as there’s no sense considering an investment in an asset that cannot be effectively traded.  But after that cut, then what?

In all likelihood some of the sectors are redundant.  You are happy to be invested in several related sectors if they are all screaming to new heights, but the problem comes when they all turn down together.  Thus the purpose of this exercise is to reduce the redundancy in hopes of reducing drawdowns of the portfolio.

Initially we selected the following sectors, solely because they “seemed good”:

Small Caps            Brazil          Biotech
Emerging Markets      Hong Kong       Retail
Consumer Staples      Japan           Energy
Pharmaceuticals       Taiwan          Materials
Info Technology       Financials      Industrials
Semiconductors        Health Care     Utilities
Government Bonds (10-year)

First we have to assemble a dataset with each column being a different asset.  If the number of assets is few, you can easily do this in a command line.  Language such as “SectorA  common  SectorB  common  SectorC” will work well.  Remember to use the command “common” rather than “with”.  The former will assemble the data columns for only the dates they have in common.  Should you have a large number of assets/sectors, use the method demonstrated on the first page of our Horizontal Sorting illustration (http://www.mathinvestdecisions.com/Horizontal%20Sorting.htm).  
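For readers who want to see what aligning on common dates amounts to, here is a minimal Python sketch. It is not FDC code: the function name `common` simply mirrors the command described above, and the three series with their dates and prices are invented for illustration.

```python
def common(*series):
    """Align several {date: value} series on the dates they all share.

    Returns the sorted shared dates and one row of values per date,
    with one column per input series (same order as the arguments).
    """
    shared = set(series[0])
    for s in series[1:]:
        shared &= set(s)          # keep only dates present in every series
    dates = sorted(shared)        # ISO date strings sort chronologically
    rows = [[s[d] for s in series] for d in dates]
    return dates, rows


# Hypothetical sample data, not taken from the article.
sector_a = {"2008-01-02": 10.0, "2008-01-03": 10.5, "2008-01-04": 10.2}
sector_b = {"2008-01-03": 20.0, "2008-01-04": 21.0, "2008-01-07": 21.5}
sector_c = {"2008-01-02": 5.0, "2008-01-03": 5.1, "2008-01-04": 5.3}

dates, rows = common(sector_a, sector_b, sector_c)
print(dates)  # ['2008-01-03', '2008-01-04']
print(rows)   # [[10.5, 20.0, 5.1], [10.2, 21.0, 5.3]]
```

Only the two dates that appear in all three series survive, which is the behavior the text attributes to “common”.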

We could check their cross-correlations on any basis we wish.  However, since our goal is to remove sectors with redundant market “signals”, let us check the cross-correlation of those signals.  This is especially appropriate since our signal, or main decision rule, is a trend/momentum indicator.  Thus we next need to apply that main decision rule to each item in our dataset.

In the case of our sector rotation program, we pick the sector with the highest 63-day (quarterly) moving slope rate of change.  Therefore we need to check the cross-correlations after that operation:

crosscor 63 msroc mixed_sectors

** N.B. The text of all commands is located at the end of this document.  That way you can simply copy them over to your own command set, should you wish to duplicate or modify the research.

When you enter this command, FDC calculates the overall correlation of each column with each other column, saves all of the information in an Excel® spreadsheet named crosscor.csv, and opens that spreadsheet on your desktop.  The file will be located in the excel subdirectory of the fdc directory.
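The calculation described above can be approximated outside FDC. The following Python sketch is an illustration only, not FDC's implementation: it assumes the correlation is the ordinary Pearson coefficient, the function names are hypothetical, the sample signals are invented, and the file is written to the system temporary directory rather than to FDC's excel subdirectory.

```python
import csv
import math
import os
import tempfile


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_correlation(columns):
    """Correlate every column with every other column."""
    k = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]


def write_matrix(matrix, path):
    """Save the matrix as a .csv file with numbered row and column labels."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow([""] + [f"Col {j + 1}" for j in range(len(matrix))])
        for i, row in enumerate(matrix, start=1):
            writer.writerow([f"Col {i}"] + [f"{v:.3f}" for v in row])


# Hypothetical signal columns (one list per asset), not real market data.
signals = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 5.9, 8.2, 9.8],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
matrix = cross_correlation(signals)
out_path = os.path.join(tempfile.gettempdir(), "crosscor_sketch.csv")
write_matrix(matrix, out_path)
```

The result is a square, symmetric table with 1.0 on the diagonal, which is the layout the rest of this walkthrough works with.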

Each new calculation will overwrite the previous crosscor.csv file without further warning.  This can be prevented by using the optional left argument to provide a new name or path for the created .csv file. For example, if the above command is modified as follows:

'myname' crosscor 63 msroc mixed

Then the .csv file will be named 'myname.csv' and it will be placed in the excel subdirectory of the fdc directory, as before. If the modification is:

'C:\folder1\myname' crosscor 63 msroc mixed

then the .csv file will be named 'myname.csv' and it will be placed in the 'C:\folder1' directory.
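The naming rule just described can be summarized in a few lines of Python. This is only a sketch of the rule as stated in the text, not FDC's own logic; the function name is hypothetical, and `C:\fdc\excel` is an assumed install location that you would replace with your own.

```python
from pathlib import PureWindowsPath

# Assumed location of the excel subdirectory of the fdc directory.
DEFAULT_DIR = PureWindowsPath(r"C:\fdc\excel")


def output_csv(left_arg=None):
    """Return the .csv path implied by the optional left argument."""
    if left_arg is None:
        # No left argument: the default file, overwritten on every run.
        return DEFAULT_DIR / "crosscor.csv"
    target = PureWindowsPath(left_arg)
    filename = target.name + ".csv"
    if target.parent == PureWindowsPath("."):
        # A bare name stays in the default directory.
        return DEFAULT_DIR / filename
    # A name with a path goes to that directory.
    return target.parent / filename


print(output_csv())                        # C:\fdc\excel\crosscor.csv
print(output_csv("myname"))                # C:\fdc\excel\myname.csv
print(output_csv(r"C:\folder1\myname"))    # C:\folder1\myname.csv
```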

At this point human judgment has to enter the selection process.  First, though, a word of caution.  Ideally you would develop your program on data from period “A”, and then test it on data from period “B”, with the two periods sharing no dates.  Both periods should contain examples of different types of market conditions.  In such ideal circumstances you would most certainly also consider the performance of the sectors during period A.

But what do you do if those ideal conditions do not exist; say, you do not have enough data, so your development and testing periods are the same?  How do you avoid curve-fitting?  You do so by being blind to performance information when you are thinning out your sectors.  The process will not be ideal, but it will still be valid.

Select the area of the correlations, as shown:

Then go to Conditional Formatting, and you will be presented with:

Enter the values shown here (.90 to .999), then open the formatting options, choose the Patterns tab, pick a color, and confirm each dialog.

You will then see:

Column 11 and Column 15 have too high a correlation for both to be included.  But which one do you drop from your population?  That’s a judgment call, and we are going to give you one way of solving it. 
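The same .90 to .999 screen can be run on the correlation table in code. The Python sketch below is an illustration under stated assumptions: the function name and the small 4 x 4 matrix are invented, and columns are numbered from 1 as in the text. An upper limit of .999 keeps the 1.0 values on the diagonal from being flagged. Unlike the spreadsheet, which colors both mirror-image cells, this reports each pair once.

```python
def pairs_in_band(matrix, low, high):
    """List (column_a, column_b, value) for pairs with low <= value <= high.

    Only the upper triangle is scanned, so each pair appears once and the
    diagonal is never considered. Column numbers are 1-based.
    """
    hits = []
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if low <= matrix[i][j] <= high:
                hits.append((i + 1, j + 1, matrix[i][j]))
    return hits


# Hypothetical correlation matrix, not the one from the article.
m = [
    [1.00, 0.95, 0.30, 0.62],
    [0.95, 1.00, 0.10, 0.55],
    [0.30, 0.10, 1.00, -0.20],
    [0.62, 0.55, -0.20, 1.00],
]

print(pairs_in_band(m, 0.90, 0.999))  # [(1, 2, 0.95)]
print(pairs_in_band(m, 0.50, 0.999))  # [(1, 2, 0.95), (1, 4, 0.62), (2, 4, 0.55)]
```

Widening the band to .5 through .999 brings in more pairs to weigh against each other.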

Go back to Conditional Formatting, this time choosing values between .5 and .999, and obtaining the following:

We cannot quite make the decision as to Columns 11 and 15, but we can certainly decide that Columns 5, 6 and 19 deserve to be retained.  However, Columns 1, 3, 12 and 14 each have 7 orange boxes.  Let’s first delete Columns 1 and 3.  Remember to also remove rows 1 and 3:

Columns 12 and 14 each have 5 colored boxes, but the values in Column 12 are higher, so let’s eliminate that column.  Then, between Columns 11 and 15, we can see that perhaps we should delete Column 11.  Column 13 also has some high values, so we eliminate that as well.

Column 14 is still showing four boxes, and Column 10 has the largest value still remaining, so we delete both of those:

We are now down to 12 assets, none of which shows more than one orange box.  The deselecting process shown here is of course somewhat subjective, but valid.  In our case, we had originally started with what we believed to be a good group.  This procedure enabled us to get a better group. 
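For comparison, here is one possible mechanical version of the thinning procedure, written in Python. It is a sketch, not the exact sequence of judgment calls made above: it assumes a simple greedy rule (repeatedly drop the column with the most boxes in the .5 to .999 band, breaking ties by the larger sum of flagged values) and stops when no column has more than one box. The function names and the 5 x 5 matrix are hypothetical.

```python
def flagged(matrix, keep, j, low, high):
    """Values in column j (restricted to kept rows) that fall inside the band."""
    return [matrix[i][j] for i in keep if i != j and low <= matrix[i][j] <= high]


def thin(matrix, low=0.5, high=0.999, max_boxes=1):
    """Greedily drop columns until none has more than max_boxes flagged cells.

    Returns the surviving column numbers, 1-based.
    """
    keep = list(range(len(matrix)))
    while True:
        # Rank columns by box count, then by total size of the flagged values.
        worst = max(
            keep,
            key=lambda j: (len(flagged(matrix, keep, j, low, high)),
                           sum(flagged(matrix, keep, j, low, high))),
        )
        if len(flagged(matrix, keep, worst, low, high)) <= max_boxes:
            return [j + 1 for j in keep]
        keep.remove(worst)  # removing a column also removes its row


# Hypothetical correlation matrix, not the one from the article.
m = [
    [1.00, 0.80, 0.70, 0.10, 0.20],
    [0.80, 1.00, 0.60, 0.00, 0.55],
    [0.70, 0.60, 1.00, 0.30, 0.10],
    [0.10, 0.00, 0.30, 1.00, -0.40],
    [0.20, 0.55, 0.10, -0.40, 1.00],
]

print(thin(m))  # [1, 3, 4, 5]
```

Because the rule looks only at correlations and never at returns, it stays blind to performance in the same way as the manual process.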

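Your next step is to rename your asset mix.  Either enter

mixed_sectors cols 2 4 5 6 7 8 9 15 16 17 18 19

and SAVE AS, or enter

thinned_sectors gets mixed_sectors cols 2 4 5 6 7 8 9 15 16 17 18 19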

Then do your sector rotation testing on the dataset “thinned_sectors”.
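The column selection that produces “thinned_sectors” (columns 2 4 5 6 7 8 9 15 16 17 18 19, as listed at the end of this document) is easy to mirror in Python. This is a sketch only: the helper name `cols` borrows the FDC keyword for readability, column numbers are assumed to be 1-based as in the text, and the 19-column placeholder dataset is invented.

```python
def cols(rows, keep):
    """Return each row restricted to the 1-based column numbers in keep."""
    return [[row[k - 1] for k in keep] for row in rows]


KEEP = [2, 4, 5, 6, 7, 8, 9, 15, 16, 17, 18, 19]

# Placeholder dataset: 3 dates x 19 assets, each cell encoding its column number.
mixed_sectors = [[float(c + 100 * r) for c in range(1, 20)] for r in range(3)]

thinned_sectors = cols(mixed_sectors, KEEP)
print(len(thinned_sectors[0]))  # 12
print(thinned_sectors[0][:3])   # [2.0, 4.0, 5.0]
```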

-------------------------------------------

crosscor 63 msroc mixed_sectors

mixed_sectors cols 2 4 5 6 7 8 9 15 16 17 18 19

thinned_sectors gets mixed_sectors cols 2 4 5 6 7 8 9 15 16 17 18 19