Adaptive Predictive Trading Systems
Steven C. Rogers, Member, EE-Pub; Leon Luxemburg; Matt McMahon
Published: February 22, 2005
Abstract—Forecasting or prediction of financial instruments has long been of great interest to practitioners and researchers. Many powerful digital signal processing (DSP) filtering techniques have been developed to support prediction, and predictive filters lend themselves naturally to the construction of trading decision logic. The techniques in this paper apply to high-frequency (intraday) trading as well as long-term trades. Linear and nonlinear concepts are applied and compared, using MATLAB for implementation, and the resulting approaches are tested on various stocks.
Article Information
Field of Study—DSP
Keywords—estimation, adaptive filters, multilayer perceptron, prediction, trading systems, financial trending
I. INTRODUCTION
Digital signal processing has long been used for time series analysis, and prediction is a subset of signal processing. Prediction requires an assumed system model of the signal to be processed; once such a model exists, it may be manipulated to estimate future output values. Larger and more sophisticated models impose correspondingly greater computational throughput requirements. With the advent of more powerful computers, prediction has increasingly been used in real-time applications such as semi-automated trading systems. Linear model structures have been used for prediction because they are simple, easy to implement, and easy to analyze. However, linear models can lack accuracy, introducing bias due to model errors and unmodeled dynamics. Nonlinear model structures, such as neural networks, have become a viable alternative to linear models. In this paper, adaptive linear and nonlinear predictive systems are presented and compared to fixed observers. These are developed into trading systems and tested on different stock price data.
II. PREDICTION FILTERS
Prediction is the forecasting side of information processing. The motivation for using a predictive system is to derive information about the future characteristics of a signal based on current and past information. A typical adaptive predictive filter structure [1]-[6] is given in figure 1.
Figure 1 Adaptive Predictive Filter Architecture
In figure 1 the problem is to predict the present value based on data n time steps in the past; the current estimate is then a smoothed version of the input data. If it is desired to estimate future values of the input data, the input to the adaptive filter may be replaced with more current data. An adaptive filter has its weights adjusted regularly to minimize some performance criterion, usually deviation from the input signal. The adaptive filter may be linear or nonlinear, and feedforward or recurrent (feedback); feedback filters provide additional dynamics in the architecture. The simplest structure is the finite impulse response (FIR) filter. It is also called moving average (MA) and has all zeros. It follows the equation

$y_k = a_0 x_{k-1} + a_1 x_{k-2} + \cdots + a_{n-1} x_{k-n}$,

where $y_k$ is the output of the filter, $x$ is the input data stream, $a_i$ is the coefficient associated with the $i$th input value, and $n$ is the length of the filter. The coefficient vector is usually updated based on the deviation from the current value of the input data stream, $x_k$. A typical update law (the least-mean-squares, or LMS, law) is

$\mathbf{a}_{k+1} = \mathbf{a}_k + \mu\, e_k\, \mathbf{x}_k, \qquad e_k = x_k - y_k$,

where $\mathbf{a}_k$ is the coefficient vector, $\mathbf{x}_k = [x_{k-1}, \ldots, x_{k-n}]^T$ is the vector of past inputs, and $\mu$ is the adaptation gain.
The adaptation gain $\mu$ is frequently modified to account for signal variance or noise power, such as the normalized form

$\mu_k = \mu / (\epsilon + \mathbf{x}_k^T \mathbf{x}_k)$,

where $\epsilon$ is a small positive constant.
The variance estimate may be smoothed by passing it through a low-pass filter. Another linear model structure is the infinite impulse response (IIR) filter, which contains both poles and zeros. It is also called autoregressive moving average (ARMA) and incorporates feedback of the output. The equation is

$y_k = b_1 y_{k-1} + b_2 y_{k-2} + \cdots + b_m y_{k-m} + a_0 x_{k-1} + a_1 x_{k-2} + \cdots + a_{n-1} x_{k-n}$,

where $m$ is the order of the autoregressive (denominator) part of the ARMA transfer function and $b$ is the coefficient vector associated with previous filter output values. The update law is similar to the FIR law above, but includes the feedback values as well. Other alternatives to these linear adaptive filters include multilayer perceptrons [7]-[10] (MLP). Generic components of neural networks are shown in figures 2 and 3.
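Before turning to the neural approaches, the following minimal MATLAB sketch illustrates an adaptive FIR predictor with the normalized LMS update described above. The filter length, gains, and variable names are our own assumptions rather than the original figure code; x is assumed to be a column vector of prices.

    % Adaptive FIR (normalized LMS) one-step predictor sketch.
    n    = 8;                % filter length (assumed)
    mu   = 0.05;             % adaptation gain (assumed)
    eps0 = 1e-6;             % small constant to avoid division by zero
    a    = zeros(n,1);       % FIR coefficient vector
    y    = zeros(size(x));   % filter output (predicted price)
    for k = n+1:length(x)
        xk   = x(k-1:-1:k-n);                   % vector of past inputs
        y(k) = a'*xk;                           % prediction of x(k)
        e    = x(k) - y(k);                     % prediction error
        a    = a + (mu/(eps0 + xk'*xk))*e*xk;   % normalized LMS update
    end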
Figure 2 Neural Network Components
Figure 3 Structure of a Neural Network
The MLP is one of the most commonly used neural network architectures and is shown in figure 4. The output equation for a feedforward single-hidden-layer MLP [8] may be written

$O = W f(V X)$,

where $O$ is the output vector, $W$ is the output matrix, $X$ is the input vector, $V$ is the input matrix, and $f(\cdot)$ is generally a squashing function selected from the examples shown in figure 3. The matrices $W$ and $V$ contain the adjustable parameters. A MATLAB code segment showing the adaptive update law for an MLP is shown in figure 5 below. In this case the squashing function is a hyperbolic tangent. Note that G is the output of the single hidden layer, dW and dV represent the momentum contributions, out is the MLP output, and mu and bet are adaptation gains. This adaptive update law is the backpropagation law slightly modified to include a momentum component. Thus, the matrix weights are updated with each new input data sample and are constantly tuned in a least-squares sense.
Figure 4 MLP Architecture
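The following MATLAB sketch is our interpretation of such an update law, using the variable names described above (G, dW, dV, out, mu, bet); it is consistent with the text but is not necessarily the exact code of figure 5. X is the current input vector, d the current target price, and V, W, dV, dW are assumed to be initialized beforehand (e.g., small random V and W, zero dV and dW).

    % Adaptive MLP update law with momentum (tanh squashing) -- interpretive sketch.
    G   = tanh(V*X);                           % single-hidden-layer output
    out = W*G;                                 % MLP output
    e   = d - out;                             % output error
    dW  = mu*e*G' + bet*dW;                    % output-layer change with momentum
    dV  = mu*((W'*e).*(1 - G.^2))*X' + bet*dV; % backpropagated hidden-layer change
    W   = W + dW;                              % weights updated every new sample
    V   = V + dV;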
Figure 5 MATLAB code segment of the MLP adaptive update law
The code segment in figure 5 may easily be modified into a recurrent network. Recurrent means that feedback, or node output, is returned and added to the input vector. The recurrent code segment is shown in figure 6. Recurrence is advantageous because it incorporates more dynamics into the neural network. Note that the only modification is in the last line: the input vector is updated to include the full output of the single hidden layer. If it were desired to feed back additional sources, or only a subset of the single hidden layer, the same basic structure would be used with G replaced by the appropriate terms.
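A sketch of the recurrent variant, again an interpretation rather than the exact figure 6 code: it repeats the update above and differs only in the final line, where the hidden-layer output G is fed back into the input vector. Here u denotes the external (non-recurrent) inputs, an assumed name, and V must be sized for the augmented input.

    % Recurrent MLP update sketch: identical except for the final line.
    G   = tanh(V*X);
    out = W*G;
    e   = d - out;
    dW  = mu*e*G' + bet*dW;
    dV  = mu*((W'*e).*(1 - G.^2))*X' + bet*dV;
    W   = W + dW;
    V   = V + dV;
    X   = [u; G];   % next input: external inputs augmented with fed-back G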
Figure 6 MATLAB code segment of the recurrent MLP adaptive update law
Once we have the adapted model, we have a means to predict future outputs. In our case we predict a single step ahead, so we run through the model once more. The MLP prediction code segment is shown below in figure 7.
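A minimal sketch of the step-ahead prediction, assuming Xnew (an assumed name) holds the most recent input samples: the freshly adapted weights are simply run forward one more time.

    % One-step-ahead prediction sketch: forward pass with the adapted weights.
    Gp   = tanh(V*Xnew);   % hidden-layer response to the newest inputs
    pred = W*Gp;           % single-step-ahead price prediction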
Figure 7 Code segment for step-ahead prediction
Now that we have step-ahead predictors, the next step is to incorporate them into profitable trading systems. The simplest approach is to create a slower comparison signal by passing the predictions through a first-order low-pass filter; crossings between the two signals then trigger buy/sell indications. This simple but effective approach is shown in the MATLAB code segment in figure 8. Note that pred is the step-ahead prediction, predslow is the slower filtered signal, and alph is the discrete first-order filter pole. A buy signal is given when pred > predslow; a sell signal occurs when pred < predslow. Since pred reacts more quickly to changes than the slower-time-constant predslow, it more closely represents the current price trend.
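A sketch consistent with that description; the pole value and the signal variable are our own assumptions, and predslow is assumed initialized (e.g., to the first prediction).

    % Buy/sell signal sketch using a first-order low-pass comparison signal.
    alph     = 0.9;                               % discrete filter pole (assumed)
    predslow = alph*predslow + (1 - alph)*pred;   % slower filtered signal
    if pred > predslow
        signal = 1;      % buy
    elseif pred < predslow
        signal = -1;     % sell
    end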
Figure 8 Code segment for buy/sell signals
For comparison, two state-estimation observer trading systems are developed. These are fixed-parameter systems that are guaranteed to be stable. EST4 uses 4 states to estimate the current price and EST3 uses 3 states. Figure 9 shows the MATLAB code for EST4.
Figure 9 EST4 MATLAB code
The 4 states are the price estimate, the rate, the acceleration, and the jerk (the derivative of the acceleration). The observer is set up inside the 'if' loop. The observer gain L is designed by pole placement using the MATLAB command 'place'. Ts is the time step in seconds per trading day. The poles are placed reasonably fast, so that they have a significantly smaller time constant than the data being tracked; consequently, good tracking results. EST3 is similar, except that the jerk state is omitted. The two were compared to determine the impact of additional states on trading system performance.
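An interpretive MATLAB sketch of such an observer, with the model matrices and pole locations our own assumptions (a constant-jerk kinematic model sampled once per trading day, price being the measured state):

    % EST4 observer sketch: states are price, rate, acceleration, and jerk.
    Ts = 1;                                    % time step (assumed: 1 trading day)
    A  = [1 Ts Ts^2/2 Ts^3/6;
          0 1  Ts     Ts^2/2;
          0 0  1      Ts;
          0 0  0      1];                      % constant-jerk state transition
    C  = [1 0 0 0];                            % only the price is measured
    L  = place(A', C', [0.2 0.25 0.3 0.35])';  % observer gain (poles assumed)
    xhat = [price(1); 0; 0; 0];                % initial state estimate
    est4 = zeros(size(price));
    for k = 2:length(price)
        xhat    = A*xhat + L*(price(k) - C*xhat);  % observer update
        est4(k) = xhat(1);                         % current price estimate
    end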
III. RESULTS
The methods described above were compared against each other based on return, as defined by

$R(i) = \dfrac{P(i) - P(i-1)}{P(i-1)}$,
where P(i) is the current price and P(i-1) is the previous price. R(i) values were summed only during buy-signal periods. This approach normalizes each price for comparison purposes. Since this study compares algorithms, ideal transactions are assumed, i.e., transaction delays and transaction costs are not included. Daily prices for the stocks given in Table 1, covering a period of approximately 2 years, were chosen for evaluation. Table 1 shows the returns
for the 6 approaches. The table values are computed as the sum of R(i) over all samples for which a buy signal is active; percent return on investment is obtained by multiplying the table values by 100%.
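A sketch of this metric in MATLAB, assuming P is the daily price vector and signal equals 1 during buy periods (our convention):

    % Return metric sketch: sum one-day returns only while a buy signal is active.
    R   = diff(P)./P(1:end-1);          % normalized one-day returns R(i)
    ret = sum(R(signal(2:end) == 1));   % summed over buy periods only
    pct = 100*ret;                      % percent return on investment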
Table 1. Return sums in percent for the approaches
The approaches rank consistently as: 1) MLP, 2) recurrent MLP (RMLP), 3) IIR, 4) EST4, 5) FIR, and 6) EST3. The RMLP is slightly ahead of the MLP, and the two consistently outperform the rest. EST4 and the adaptive FIR are close to each other in performance. EST4 and EST3 are fixed, non-predictive state-estimation observer schemes with 4 and 3 states, respectively. The adaptive predictive schemes generally outperform the fixed non-predictive schemes. The following figures show the buy/sell comparisons for some of the stocks.
IV. CONCLUSIONS
A number of alternative trading systems have been developed and compared, including simple observer-based approaches. MATLAB code has been provided, along with simulation results for the various methods. Of the 6 approaches considered, the MLP and recurrent MLP appear to be superior; they have been examined with stock price data of differing characteristics and have performed well in all cases. More testing and evaluation are necessary before definitive performance conclusions can be drawn; however, there is sufficient basis for further testing.
V. ABOUT THE AUTHORS
Steven C. Rogers is with the Institute for Scientific Research, Fairmont, WV 26554, USA, srogers@isr.us. Leon Luxemburg is with the Institute for Scientific Research, Fairmont, WV 26554, USA, lluxemburg@isr.us. Matt McMahon is with the Institute for Scientific Research, Fairmont, WV 26554, USA, mmcmahon@isr.us.
VI. REFERENCES
[1] Gelb, A., Applied Optimal Estimation, MIT Press, 1974, ISBN 0-262-57048-3.
[2] Juang, Jer-Nan, Applied System Identification, Prentice Hall, 1994, ISBN 0-13-079211-X.
[3] Kailath, T., et al., Linear Estimation, Prentice Hall, 2000, ISBN 0-13-022464-2.
[4] Baher, H., Analog & Digital Signal Processing, John Wiley & Sons, 1990, ISBN 0-471-92342-7.
[5] Ziemer, R.E., and Tranter, W.H., Principles of Communications, 4th Edition, John Wiley & Sons, 1995, ISBN 0-471-12496-6.
[6] Shenoi, K., Digital Signal Processing in Telecommunications, Prentice Hall, Englewood Cliffs, NJ, 1995, ISBN 0-13-096751-3.
[7] Haykin, Simon, Neural Networks: A Comprehensive Foundation, 2nd Edition, Prentice Hall, 1999, ISBN 0-13-272250-1.
[8] Herbrich, R., et al., "Neural Networks in Economics: Background, Applications and New Developments," http://stat.cs.tu-berlin.de/publications.
[9] Goonatilake, S., et al. (editors), Intelligent Systems for Finance and Business, Wiley, 1996, ISBN 0-471-94404-1.
[10] Beltratti, A., et al., Neural Networks for Economic and Financial Modeling, Thomson Computer Press, 1996, ISBN 1-85032-16908.
Figure 10 Exxon buy-sell comparisons
Figure 11 Boeing buy-sell comparisons
Figure 12 Raytheon buy-sell comparisons
Figure 13 Walgreen buy-sell comparisons
Figure 14 Pepsico buy-sell comparisons
Figure 15 Monster buy-sell comparisons
Figure 16 Continental Airlines buy-sell comparisons
Figure 17 Fifth Third Bancorp buy-sell comparisons
VII. APPENDIX
The following MATLAB code is the authors' interpretation and is included as a study aid only. The complete MATLAB code for the six trading systems is given in its entirety to show how the components interface.