A Survey Course on Data Smoothers

In this illustration we are going to show you a wide assortment of data smoothing operations.  Each will create a “surrogate” price that will have the same scale as the original data series.  That is, if the Dow Jones Industrial Average is around 11,000 you would expect your smoother to give you a surrogate price in that vicinity.  There are many other ways of smoothing data that result in a different scale.  We will leave those for another illustration.

Each smoother has advantages and disadvantages.  In some cases, the advantages and disadvantages are identical, i.e. the good news is “it’s really smooth” and the bad news is “it’s really smooth”.  Usually there is a tradeoff in that the more you smooth a data series, the more lag you introduce.  For example, the most appropriate way to smooth data of a known periodicity (cyclicality) is to use a moving average of that period.  If you have daily market data that has a cycle of one year, you would best smooth it with a 252-day moving average.  (There are 252 trading days in the typical year.)  Unfortunately your moving average is most applicable when compared with the middle value of that period, or the data 126 days ago.  So your surrogate is appropriate, but has a lag of half a year.  That will not be of much help with a market that is open every day.

In each case, we will mention “days”, but you could apply these smoothers against any dataset.  Thus “days” could also be weeks, months, 5-minute periods, etc. 

When we consider smoothness, we should consider what smooth is being compared to.  In most cases, the non-smooth price or raw price is the close.  There are several reasons why the close is the most important price of the day.  Firstly, it’s the last price of the day, and secondly, it’s the price by which all settlement values are determined.  That is, if you have a margined position, your margin requirement is based on the closing price.  Those of us who do quantitative work share this view of the importance of the close relative to all other prices.  All other prices lag the close in providing important signal-generating information.

Intra-day Smoothing:  Midrange, Typical

If you have more than one price for a day, then you can easily smooth the data by taking each day’s midrange (the average of the day’s high and low).  Alternatively you can average the closing price into the mix, generating the “typical” price (the average of the high, low and close).

The above chart shows closing data (blue) plotted with midrange data (red).  Both have subsequently been smoothed with a 20-day moving average.  Note that on the turns, the blue tends to turn before the red, which is to say that using the midrange produces a 1-day (or longer) lag. 
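As a minimal sketch of the two intra-day prices discussed in this section, assuming the usual definitions: the midrange is the average of the day’s high and low, and the typical price is the average of the high, low and close.  The function names and the sample bars are ours, for illustration only.

```python
def midrange(high, low):
    """Average of the day's high and low."""
    return (high + low) / 2.0


def typical(high, low, close):
    """Average of the day's high, low and close."""
    return (high + low + close) / 3.0


# Hypothetical daily bars as (high, low, close); not real market data.
bars = [(102.0, 98.0, 101.0), (104.0, 100.0, 100.5), (103.0, 99.0, 102.5)]

mid = [midrange(h, l) for h, l, c in bars]
typ = [typical(h, l, c) for h, l, c in bars]
```

Either series can then be fed to any of the smoothers that follow in place of the close.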

Moving Averages

Moving averages are the simplest smoothers.  Before the widespread use of PCs, one could easily keep a moving average up to date with a hand calculator, as the only data necessary for the calculation were the price being added and the price being dropped.  That exercise warned users to “Beware the dog that bites twice”, which is to say that the moving average is affected not only by the new day being added, but also by the old day being dropped.  That “dog bite” problem can be significant in shorter-term moving averages.  Most traders who have always used computers are unaware of the dog bite problem.
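The add-one, drop-one bookkeeping described above can be sketched as follows.  This is only an illustration: the function name, the choice to return None until a full window is available, and the sample data are our assumptions.

```python
from collections import deque


def moving_average(prices, n):
    """Trailing n-day simple moving average, kept up to date by adding
    the new price and dropping the oldest one.

    Returns a list as long as prices; entries before the first full
    window are None.
    """
    window = deque()
    total = 0.0
    out = []
    for p in prices:
        window.append(p)
        total += p                      # first bite: the new day is added
        if len(window) > n:
            total -= window.popleft()   # second bite: the oldest day is dropped
        out.append(total / n if len(window) == n else None)
    return out


# Hypothetical data: a one-day spike to 20 in an otherwise flat series.
prices = [10, 10, 20, 10, 10, 10, 10]
sma3 = moving_average(prices, 3)
```

In this example the 3-day average rises when the spike enters the window and falls again three days later when the spike drops out, even though nothing new has happened in the market on that later day.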

Moving averages weight each day the same, and can be performed over moving periods of whatever length and segmentation you wish.  By segmentation we mean weekly or monthly, but you could just as easily consider every-other or every-tenth day.  You should also consider weekly averages with the week ending on a day other than Friday.

As stated above, moving averages are the best way to smooth data of known periodicity, and certain market data is hugely periodic.  For example, look at the data below:

This is a chart of the open interest of call options on the Chicago Board Options Exchange.  This cyclicality of outstanding options affects option volatility and thereby option prices, and has a feedback effect on the prices of the underlying securities.  As option expiry is monthly, this cyclicality can effectively be eliminated by averaging over the number of trading days in a month (usually 21).  Here’s a chart of the S&P with a 20-day moving average of the close.

Weighted and Exponential Smoothers

Some would argue that the days should not be weighted equally.  After all, isn’t it logical that the most recent data is more important?  For example, instead of weighting each of your 20 days in a month equally, you could weight the oldest five days at a value of 1, the next five days at a value of 2, the third five at 3, and the most recent five at 4.
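A minimal sketch of that stepped weighting, assuming the weights are normalized by their sum so the surrogate stays on the same scale as the price.  The function name and the sample data are hypothetical.

```python
def weighted_moving_average(prices, weights):
    """Trailing weighted moving average.

    weights[0] applies to the oldest day in the window and weights[-1]
    to the most recent day.  Entries before the first full window are None.
    """
    n = len(weights)
    wsum = float(sum(weights))
    out = []
    for i in range(len(prices)):
        if i + 1 < n:
            out.append(None)            # not enough history yet
            continue
        window = prices[i + 1 - n : i + 1]
        out.append(sum(w * p for w, p in zip(weights, window)) / wsum)
    return out


# The 20-day scheme from the text: five days each at weights 1, 2, 3 and 4.
step_weights = [1] * 5 + [2] * 5 + [3] * 5 + [4] * 5

# Hypothetical data: 15 days at 100 followed by 5 days at 110.
prices = [100.0] * 15 + [110.0] * 5
wma = weighted_moving_average(prices, step_weights)
```

On this sample the weighted average ends at 104, where an equally weighted 20-day average would end at 102.5, because the five most recent days carry the largest weight.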

Why stop there?  Instead of having older data drop in importance by a large amount every five days, you could have its importance decay at a continuous exponential rate.  That is, you could have the current day’s data contribute 20 percent of your smooth surrogate, and the entire prior history contribute the other 80 percent.  An exponential smoother eliminates the problem of dropping the older data, and a weighted moving average reduces it, because the day being dropped carries the least weight.  With the exponential smoother, this dog only bites once - when the new data is added.
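The 20/80 split described above can be sketched in a few lines.  Seeding the smoother with the first price is our assumption (other seeds are possible), and the function name and sample data are illustrative.

```python
def exponential_smooth(prices, alpha=0.2):
    """Exponential smoother.

    Each output is alpha times today's price plus (1 - alpha) times the
    previous output.  The first output is seeded with the first price
    (an assumption made for this sketch).
    """
    out = []
    prev = None
    for p in prices:
        prev = p if prev is None else alpha * p + (1.0 - alpha) * prev
        out.append(prev)
    return out


# Hypothetical data: a step from 100 to 110.
prices = [100.0, 110.0, 110.0, 110.0]
ema = exponential_smooth(prices, alpha=0.2)
```

Because each output depends only on today’s price and yesterday’s output, nothing is ever dropped from a window; old data simply fades at 80 percent per day.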

Here’s what the exponential smoother described above looks like:

As you can see, this smoothing tool tracks better, but changes direction more frequently.  Smoothing can be done repeatedly, with each pass producing smoother data and introducing further lag.  For example, let’s take that red line above and smooth it again at the same rate, and compare the new doubly-smoothed surrogate (green) to our red line:

But how does this (green) compare with our first attempt using a simple or linear-weighted moving average (orange):

Exponential decay can be further managed such that different decay rates can be dictated to affect different parts of the prior data history.  This procedure is referred to as a 2-stage (or N-stage) exponential.  Managing several rates of decay enables one to construct a much smoother surrogate with less lag than other smoothers.  This process can be done with an infinite number of variations, which we have illustrated elsewhere: http://www.fdcusers.com/Filter.htm

Regression Smoothing: Moving Trends and Moving Parabolas

Suppose you take a 20-day period and run a linear regression line through the data, such as the following:

Then you save the last value of that line, as the “value described by the most recent 20-day trend”.   Then you march ahead one day and do the same calculation for that 20-day period, again saving that last data point.  You thus have a collection of these moving linear regression values, or moving trends.  A good illustration of the moving trend at work identifying market tops and bottoms can be found at: http://www.fdcusers.com/FDC%20identifies%20tops%20and%20bottoms.htm

The moving trend is a much different surrogate than what we have previously illustrated.  This operation does not merely describe the past, but has predictive capability.  Each trend calculation produces a formula which can then be used to forecast data that would lie on an extension of that trendline.
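A minimal sketch of the moving trend, assuming an ordinary least-squares line fitted to each trailing window with the days numbered 0 to n-1.  The function names and the `extend` parameter (how many days past the window to project the line) are ours, for illustration.

```python
def linear_fit(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...

    Returns (intercept, slope).  Needs at least two points.
    """
    n = len(ys)
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def moving_trend(prices, n=20, extend=0):
    """Value of each trailing n-day regression line at its last day.

    extend > 0 projects the same line that many days beyond the window.
    Entries before the first full window are None.
    """
    out = []
    for i in range(len(prices)):
        if i + 1 < n:
            out.append(None)
            continue
        intercept, slope = linear_fit(prices[i + 1 - n : i + 1])
        out.append(intercept + slope * (n - 1 + extend))
    return out
```

With `extend=0` this gives the saved end point of each window’s line; a positive `extend` reads the forecast off the extension of that same line.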

Now, much of economic data is linear, and linear data would be best fit with a linear regression line.  But what if the data is not linear, like most markets?  Such data would be better fit with a parabolic trendline.  In that case you would find the parabola that best fits the data, and save the last value of that parabola.

With a good eye you can pick out the difference between the moving trend and the moving parabola.  But let’s make it easy for you: Here we show the linear fit in dark purple, with the parabolic fit in light blue.  For this data, the parabolic fit is certainly smoother and has less lag.  See: http://www.fdcusers.com/Moving%20Parabolic%20Fits.htm

Moving parabolas also come with formulae which can be used to extend the parabolas.  Here’s what happens when you calculate 20-day moving parabolas and extend them into the future by 2 days (the red line) and compare them with the parabola extended backward in time by 5 days (green line).  Thus you have a leading indicator plotted against a laggard.
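The moving parabola and its extensions can be sketched the same way.  This is a hedged illustration that assumes an ordinary least-squares quadratic on each trailing window; the function names and the `shift` parameter (positive to extend forward, negative to read the parabola at an earlier day) are hypothetical.

```python
def parabola_fit(ys):
    """Least-squares parabola y = a + b*x + c*x**2.

    x is measured from the middle of the window, so the sums of odd
    powers of x vanish and the normal equations simplify.
    Returns (a, b, c, x_last), where x_last is the x of the final point.
    Needs at least three points.
    """
    n = len(ys)
    half = (n - 1) / 2.0
    xs = [i - half for i in range(n)]
    s2 = sum(x ** 2 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    det = n * s4 - s2 * s2
    a = (s4 * sy - s2 * sx2y) / det
    b = sxy / s2
    c = (n * sx2y - s2 * sy) / det
    return a, b, c, half


def moving_parabola(prices, n=20, shift=0):
    """Value of each trailing n-day parabola at its last day plus shift.

    Entries before the first full window are None.
    """
    out = []
    for i in range(len(prices)):
        if i + 1 < n:
            out.append(None)
            continue
        a, b, c, x_last = parabola_fit(prices[i + 1 - n : i + 1])
        x = x_last + shift
        out.append(a + b * x + c * x * x)
    return out
```

Under these assumptions, `moving_parabola(prices, 20, shift=2)` would correspond to the leading line described above and `moving_parabola(prices, 20, shift=-5)` to the laggard.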

Non-Linear Smoothing Operations

Moving Medians

Instead of taking an N-day moving average (also referred to as a moving mean), you could take an N-day moving median.  The median is simply the middle value, or the value such that fifty percent of the data is above and fifty percent below.  Whereas a moving average considers all values, medians have the distinct advantage of discarding outlying values.  This advantage is particularly evident when operating on smaller strings of data.  For example, the 5-day moving average is particularly vulnerable to the “dog who bites twice”, as described in the moving averages section, whereas the 5-day moving median rarely gets bitten. 
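A short sketch comparing a 5-day moving median with a 5-day moving average on a series containing a single outlier.  The function name and the sample data are ours, for illustration only.

```python
from statistics import median


def moving_median(prices, n=5):
    """Trailing n-day moving median; None until a full window exists."""
    out = []
    for i in range(len(prices)):
        if i + 1 < n:
            out.append(None)
        else:
            out.append(median(prices[i + 1 - n : i + 1]))
    return out


# Hypothetical data: a flat series with one outlying day at 50.
prices = [10, 10, 10, 10, 10, 50, 10, 10, 10, 10, 10]

med5 = moving_median(prices, 5)
avg5 = [None if i + 1 < 5 else sum(prices[i - 4 : i + 1]) / 5.0
        for i in range(len(prices))]
```

Here the median never leaves 10, while the average jumps to 18 when the outlier arrives and falls back to 10 five days later when it is dropped.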

Point & Figure

Most people only know Point & Figure as charts made up of x’s and o’s, where the reference to time has been eliminated.  Few understand P&F for what it really is:  an adaptive, asymmetric non-linear smoothing tool.  Note that one cannot make the little x’s and o’s until he has first determined where and when the next box is added.  Instead of adding a box, he only has to connect a line to where that box would be located.  We have written articles on how this is done, and it would be unnecessarily repetitive to cover that same territory here.

In many ways, P&F is the opposite of the moving median.  Whereas the median discriminates against an abrupt move, P&F immediately recognizes moves beyond a certain size and ignores periods of inactivity. 

Although P&F data is the result of a filtering process, the output is jagged rather than smooth.  It is nonetheless a very interesting process that creates surrogates for market data.  Here’s a chart showing one application of P&F, extended to accommodate a daily chart.

One of the major advantages of P&F is that it discriminates in favor of a trend.  Of course, once you begin viewing P&F data as a line, it is quite easy to apply other smoothing tools to it, or use it as input for a neural network. 
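To make the idea of P&F as a line concrete, here is a simplified sketch of one common close-only formulation.  It is not necessarily the method used in our articles: the box size, the three-box reversal rule, the starting level at the first price, and the function name are all assumptions chosen for illustration.

```python
def point_and_figure_line(prices, box=10.0, reversal=3):
    """Close-only Point & Figure level, one value per input price.

    The level advances in whole boxes when price continues in the
    current direction, and only turns when price moves at least
    `reversal` boxes the other way.  Smaller moves are ignored.
    """
    level = prices[0]      # last recorded box level (assumed start)
    direction = 0          # +1 rising column, -1 falling column, 0 undecided
    out = []
    for p in prices:
        boxes = int(abs(p - level) // box)   # whole boxes away from the level
        if boxes:
            move = 1 if p > level else -1
            if direction == 0 or move == direction:
                # Continuation: a single box is enough to extend the column.
                level += move * boxes * box
                direction = move
            elif boxes >= reversal:
                # Reversal: requires at least `reversal` boxes.
                level += move * boxes * box
                direction = move
        out.append(level)
    return out


# Hypothetical closes, not real market data.
prices = [100, 104, 111, 125, 118, 109, 96, 88, 79]
pf = point_and_figure_line(prices, box=10.0, reversal=3)
```

The asymmetry is visible in the sample: one box is enough to extend the rising column to 120, but the pullback to 96 is ignored, and the line only turns down once price is three full boxes below the level.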

Range Bars

Instead of looking at the S&P as data broken up according to its daily (i.e. time delineated) occurrence, or according to movements of “at least so-many” points (as in Point & Figure charts), some traders like to break the data up into movements of a specific height.  For example, if you considered a 10-point move in the S&P as your measuring device, then you would create a bar of data every time the closing price moved by 10 points.  You could also have as your input the price of the S&P every minute.  The input can be any data consisting of a time series of only one price (such as the last sale) and the output is bar data consisting of four prices (open, high, low and close).  The close of the resulting bar will always be at one of the extremes (high or low). 

In some cases, depending on the bar height chosen, there will be several bars for each date, and at other times there will be one bar representing several dates.  The data will be dated as of the last date.  This form of presenting the data essentially renders time irrelevant and discriminates against periods of inactivity.  In that regard, Range bars function similarly to Point & Figure.

Below is a comparison plotted on the same time scale.  That is, the Range Bars below are identical to those in the above chart.  By plotting them with the normal time series data you can see how they discriminate against periods of inactivity.

The charting of data in the form of Range Bars is not the “end game”.  As with everything else illustrated, you should view the assembly of data in Range Bar form as merely the precursor to further analysis. 
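A minimal sketch of building Range Bars from a single-price series.  The rules here are our assumptions about one reasonable construction: a bar completes when price has moved the full bar height from the bar’s opposite extreme, the next bar opens at that close, a large jump can complete several bars at once, and the final incomplete bar is discarded.  Dates are not tracked, and the function name is hypothetical.

```python
def range_bars(prices, height=10.0):
    """Convert a single-price series into fixed-height range bars.

    Returns a list of (open, high, low, close) tuples.  Each completed
    bar spans exactly `height`, and its close is at its high or its low.
    """
    bars = []
    o = hi = lo = prices[0]
    for p in prices[1:]:
        while True:
            if p - lo >= height:            # bar completed to the upside
                close = lo + height
                bars.append((o, close, lo, close))
            elif hi - p >= height:          # bar completed to the downside
                close = hi - height
                bars.append((o, hi, close, close))
            else:
                break
            o = hi = lo = close             # next bar opens at the close
        hi = max(hi, p)
        lo = min(lo, p)
    return bars


# Hypothetical prices, not real market data.
prices = [100, 104, 97, 108, 131, 125, 119]
bars = range_bars(prices, height=10.0)
```

In the sample, the single jump from 108 to 131 completes two bars, while the small moves between 100 and 104 produce none.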

Cyclic Fits

One of the best ways to create a surrogate is to analyze the data series for cyclic behavior, find the best fitting cycles and put them together.  The magenta line below is a composite of the 10 most dominant cycles that can be obtained from the dataset. 

This cyclic composite is an excellent way to fit past data.  Furthermore, since one has the formulae for the construction of the cycles, those formulae can be used to produce a cyclic prediction.  However, the methodology has several problems.  Chiefly, although it can be used to generate a prediction, that prediction has no reliability.  Many other smoothers are not good predictors either; it’s just that most people believe that past cycles will repeat.  The cyclic values are also not stable over time.  That is, if you cyclically analyze N days through the most recent Monday, and then analyze the same number of days through Tuesday, the entire fit of Tuesday values will be different from the entire fit of Monday values.  This is akin to the dog bite problem discussed above, except it is worse:  With all of the prior examples, the past values never change.  With cyclic approximation, all the past values change.  So be warned.
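The text does not say how the dominant cycles are found.  As one possible sketch, the following uses a plain discrete Fourier transform and rebuilds the series from its mean plus the k largest-amplitude cycles.  The choice of the DFT, the function name and the parameter k are our assumptions, not a description of the method behind the chart.

```python
import cmath
import math


def dominant_cycle_fit(ys, k=10):
    """Rebuild ys from its mean plus its k largest-amplitude DFT cycles."""
    n = len(ys)
    mean = sum(ys) / n
    dev = [y - mean for y in ys]

    # DFT coefficients for 1 .. n//2 cycles per window.
    coeffs = []
    for f in range(1, n // 2 + 1):
        c = sum(d * cmath.exp(-2j * math.pi * f * t / n)
                for t, d in enumerate(dev))
        coeffs.append((f, c))

    # Keep the k cycles with the largest amplitude.
    coeffs.sort(key=lambda fc: abs(fc[1]), reverse=True)

    fit = [mean] * n
    for f, c in coeffs[:k]:
        # For real data, frequencies f and n - f are conjugates, so each
        # kept cycle counts twice, except the Nyquist term (f == n / 2).
        scale = 1.0 if 2 * f == n else 2.0
        for t in range(n):
            term = c * cmath.exp(2j * math.pi * f * t / n)
            fit[t] += scale * term.real / n
    return fit
```

Because the whole fit is recomputed from whatever window it is given, running this on N days through Monday and again on N days through Tuesday will generally change every fitted value, which is the instability warned about above.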

Further Topics

Covering every type of smoother or combination is not possible in what is essentially an introductory or survey article.  Suffice it to say that anything that can be done on a daily basis can also be done at different time scales.  For example, you can easily look at weekly price series or the averages of the data within each week (or monthly or quarterly).  Going the other way, you could perform the same operations on hourly data or minute data.  Should you be concerned that looking in such detail gives you too much noise, then use Point & Figure or Range Bars to eliminate some of the periods.

You may also wish to look at http://www.mathinvestdecisions.com/jurikpackage.htm, which describes all of the non-linear smoothers created by Mark Jurik.  On that page is also a description of using Jurik’s Composite Fractal Behavior indicator to adaptively modify a moving average.  In that way you have the market’s behavior dictate the length of the look-back period of your moving average smoother.  That is but one example, as any of the above-illustrated smoothers can take adaptive parameters.

Separately we have illustrated the decomposition of data using wavelet technology, and the recomposition of that output to provide faster (but still accurate) market signals.  That description can be found on our User’s site at http://www.fdcusers.com/Wavelets.htm.

We have not illustrated phase-shifting or “goosing” of data.  If you take a smooth surrogate for your data (like any of the above) and add to it a measure of momentum of the same data (such as an oscillator), you will get a smooth, but accelerated, surrogate for your data.  Further discussion belongs in a paper on oscillators.

The All-Important Question:

Which of the above is best?  Naturally, that depends on what you want to use it for.  Essentially the study of smoothers is the creation of surrogates.  In creating a surrogate your aim is to have a dataset with advantages not in your original, and minus the disadvantages of your original.  Thus, to create the best surrogate, simply identify the advantages and disadvantages of your original data, and use the smoother (or combination) that correctly deals with those attributes.  In some cases, you need to go high-tech, whereas at other times a simple moving average is most appropriate.
