Hi all, for this post I will be building a simple moving average crossover trading strategy backtest in Python, using the S&P500 as the market to test on.
A simple moving average crossover strategy is possibly one of, if not the, simplest examples of a rules-based trading strategy using technical indicators, so I thought this would be a good example for those learning Python; I'll try to keep it as simple as possible and build up from there.
So as always when using Python for financial data related shenanigans, it’s time to import our required modules:
import pandas as pd
import numpy as np
from pandas_datareader import data
We will first use the pandas-datareader functionality to download the price data from the first trading day in 2000, until today, for the S&P500 from Yahoo Finance as follows:
sp500 = data.DataReader('^GSPC', 'yahoo', start='1/1/2000')
Ok, let’s do a quick check to see what format the data has been pulled down in.
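If you want to follow along without the download, a synthetic stand-in can be inspected in the same way. This is only a sketch: `make_synthetic_prices` is a hypothetical helper of my own, the random walk it produces is not S&P500 data, and the starting level of 1400 is arbitrary.

```python
import numpy as np
import pandas as pd


def make_synthetic_prices(n_days=1000, start="2000-01-03", seed=0):
    """Hypothetical helper: a random-walk 'Close' series on business days."""
    rng = np.random.default_rng(seed)
    dates = pd.bdate_range(start=start, periods=n_days)
    # Geometric random walk; drift and volatility are made-up values.
    log_steps = rng.normal(loc=0.0002, scale=0.01, size=n_days)
    close = 1400.0 * np.exp(np.cumsum(log_steps))
    return pd.DataFrame({"Close": close}, index=dates)


prices = make_synthetic_prices()

# The same quick checks you would run on the downloaded DataFrame.
print(prices.head())    # first five rows
print(prices.dtypes)    # column types
print(type(prices.index))
```

The later steps only rely on a date index and a 'Close' column, which is why the stand-in carries nothing else.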
Good stuff, so let’s create a quick plot of the closing prices to see how the S&P has performed over the period.
The trend strategy we want to implement is based on the crossover of two simple moving averages: the 2-month (42 trading days) and 1-year (252 trading days) moving averages.
Our first step is to create the moving average values and simultaneously append them to new columns in our existing sp500 DataFrame.
sp500['42d'] = np.round(sp500['Close'].rolling(window=42).mean(), 2)
sp500['252d'] = np.round(sp500['Close'].rolling(window=252).mean(), 2)
The above code both creates the series and automatically adds them to our DataFrame. We can see this as follows (I use the ‘.tail’ call here as the moving averages don’t actually hold values until day 42 and day 252, so they will just show up as ‘NaN’ in a ‘.head’ call):
And here we see that indeed the moving average columns have been correctly added.
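To see why ‘.head’ would only show ‘NaN’ here, a tiny made-up series with a 3-row window (standing in for the 42 and 252 day windows) shows the warm-up behaviour. The numbers are illustrative only, not index data.

```python
import pandas as pd

# Small made-up price series, not real index data.
frame = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]})

# A 3-row window stands in for the 42d / 252d windows in the post.
frame["3d"] = frame["Close"].rolling(window=3).mean()

# The first (window - 1) rows have no full window behind them, so they are NaN.
nan_rows = int(frame["3d"].isna().sum())

print(frame.head(3))  # includes the NaN warm-up rows
print(frame.tail(3))  # fully populated moving average
print("warm-up rows:", nan_rows)
```

By the same logic a 252-row window leaves the first 251 rows empty, which matters later when the signal column is built.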
Now let’s go ahead and plot the closing prices and moving averages together on the same chart.
Our basic data set is pretty much complete now; all that’s really left to do is devise a rule to generate our trading signals.
We will have 3 basic states/rules:
1) Buy Signal (go long) – the 42d moving average is for the first time X points above the 252d trend.
2) Park in Cash – no position.
3) Sell Signal (go short) – the 42d moving average is for the first time X points below the 252d trend.
The first step in creating these signals is to add a new column to the DataFrame which is just the difference between the two moving averages:
sp500['42-252'] = sp500['42d'] - sp500['252d']
The next step is to formalise the signals by adding a further column which we will call Stance. We also set our signal threshold ‘X’ to 50 (this is somewhat arbitrary and can be optimised at some point).
X = 50
sp500['Stance'] = np.where(sp500['42-252'] > X, 1, 0)
sp500['Stance'] = np.where(sp500['42-252'] < -X, -1, sp500['Stance'])
sp500['Stance'].value_counts()
(n.b. there was an error in the logic of the above lines of code when this article was first posted, so you may well get significantly different results even if you use the same inputs and time period as I have. The error was that I had omitted the minus sign in front of the “X” in the second np.where line; it was kindly pointed out by Theodore in the comments section on 07/03/2019.)
The last line of code above produces:
-1    2077
 1    1865
 0     251
Name: Stance, dtype: int64
This shows that during the time period we have chosen to backtest, on 2077 trading dates the 42d moving average lies more than 50 points below the 252d moving average, and on 1865 it lies more than 50 points above. The 0 bucket also includes the first 251 rows, where the 252d moving average is still NaN and so both comparisons evaluate to False.
A quick plot shows a visual representation of this ‘Stance’. I have set the ‘ylim’ (which is the y axis limits) to just above 1 and just below -1 so we can actually see the horizontal parts of the line.
Everything is now in place to test our investment strategy based upon the signals we have generated. In this instance we assume for simplicity that the S&P500 index can be bought or sold directly and that there are no transaction costs. In reality we would need to gain exposure to the index through ETFs, index funds or futures on the index…and of course there would be transaction costs to pay! Hopefully this omission won’t have too much of an effect as we don’t plan to be in and out of trades “too often”.
So in this model, our investor is either long the market, short the market or flat. This allows us to work with market returns and simply multiply the day’s market return by -1 if he was short, 1 if he was long and 0 if he was flat at the previous day’s close.
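One rough way to check how often the strategy actually trades is to count the days on which the stance changes. The sketch below uses a made-up stance series, and `cost_per_unit` is a purely hypothetical figure I have picked for illustration, not a real cost estimate.

```python
import pandas as pd

# Made-up stance series: 0 = flat, 1 = long, -1 = short.
stance = pd.Series([0, 0, 1, 1, 1, 0, -1, -1, 1])

# A change in stance from one day to the next means a trade was needed.
step = stance.diff().fillna(0)
n_trades = int((step != 0).sum())

# Units of index exposure traded; flipping from short to long counts as 2.
turnover = float(step.abs().sum())

# Hypothetical proportional cost per unit of exposure traded (an assumption).
cost_per_unit = 0.0005
estimated_cost = turnover * cost_per_unit

print("stance changes:", n_trades)
print("turnover:", turnover)
print("estimated cost drag:", estimated_cost)
```

Running the same count on the real ‘Stance’ column would show whether “not too often” holds for the chosen threshold.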
So we add yet another column to the DataFrame to hold the daily log returns of the index and then multiply that column by the ‘Stance’ column to get strategy returns:
sp500['Market Returns'] = np.log(sp500['Close'] / sp500['Close'].shift(1))
sp500['Strategy'] = sp500['Market Returns'] * sp500['Stance'].shift(1)
Note how we have shifted the sp500[‘Stance’] series down by one day, so that we are using the ‘Stance’ at the close of the previous day to calculate the return on the next day. The shift on sp500[‘Close’] simply gives us the previous day’s close for the log return calculation.
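A four-row example with made-up prices and stances makes the effect of the shift easier to see. Each day's strategy return uses the stance from the row above it, never the stance from the same day.

```python
import numpy as np
import pandas as pd

# Made-up closes and stances, chosen so the returns are easy to read.
frame = pd.DataFrame({
    "Close":  [100.0, 110.0, 99.0, 108.9],
    "Stance": [1, -1, 0, 1],
})

# Daily log return of the market.
frame["Market Returns"] = np.log(frame["Close"] / frame["Close"].shift(1))

# Yesterday's stance applied to today's return (no look-ahead).
frame["Strategy"] = frame["Market Returns"] * frame["Stance"].shift(1)

print(frame)
```

On the second row the investor was long the day before, so he earns the +10% move. On the third row he was short, so the 10% fall becomes a gain. On the last row he was flat, so the strategy return is zero even though the market rose.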
Now we can plot the returns of the S&P500 versus the returns on the moving average crossover strategy on the same chart for comparison:
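One way to build the two lines for such a comparison, sketched here on made-up numbers rather than the real index: because the returns are log returns, they can be summed cumulatively and then exponentiated to get the growth of one unit invested.

```python
import numpy as np
import pandas as pd

# Made-up closes and stances, not real index data.
close = pd.Series([100.0, 102.0, 101.0, 105.0, 103.0])
stance = pd.Series([1, 1, -1, -1, 1])

returns = pd.DataFrame({"Market Returns": np.log(close / close.shift(1))})
returns["Strategy"] = returns["Market Returns"] * stance.shift(1)

# Log returns add up over time; exp() turns the running sum into growth of 1 unit.
growth = returns.cumsum().apply(np.exp)

print(growth)
# growth.plot() would chart both columns (pandas plotting needs matplotlib).
```

The final ‘Market Returns’ value equals the last close divided by the first, which is a handy sanity check that the cumulative sum has been built correctly.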
So we can see that although the strategy seems to perform rather well during market downturns, it doesn’t do so well during market rallies or when the market is just trending upwards.
Over the test period it barely outperforms a simple buy and hold strategy, which is hardly enough to call it a “successful” strategy.
But there we have it: a simple moving average crossover strategy backtested in Python from start to finish in just a few lines of code!