Moving Average Crossover Trading Strategy Backtest in Python


Hi all, for this post I will be building a simple moving average crossover trading strategy backtest in Python, using the S&P500 as the market to test on.

A simple moving average crossover strategy is possibly one of, if not the, simplest examples of a rules-based trading strategy using technical indicators, so I thought this would be a good example for those learning Python; I will try to keep it as simple as possible and build up from there.

So, as always when using Python for financial data related shenanigans, it’s time to import our required modules:

import pandas as pd
import numpy as np
from pandas_datareader import data

We will first use the pandas-datareader functionality to download the price data from the first trading day in 2000, until today, for the S&P500 from Yahoo Finance as follows:

sp500 = data.DataReader('^GSPC', 'yahoo',start='1/1/2000')

OK, let’s do a quick check to see what format the data has been pulled down in.

sp500.head()


Good stuff, so let’s create a quick plot of the closing prices to see how the S&P has performed over the period.

sp500['Close'].plot(grid=True,figsize=(8,5))


The trend strategy we want to implement is based on the crossover of two simple moving averages; the 2 months (42 trading days) and 1 year (252 trading days) moving averages.

Our first step is to create the moving average values and simultaneously append them to new columns in our existing sp500 DataFrame.

sp500['42d'] = np.round(sp500['Close'].rolling(window=42).mean(),2)
sp500['252d'] = np.round(sp500['Close'].rolling(window=252).mean(),2)

The above code both creates the series and automatically adds them to our DataFrame. We can see this as follows (I use the ‘.tail’ call here as the moving averages don’t actually hold values until day 42 and day 252, so they will just show up as ‘NaN’ in a ‘.head’ call):

sp500.tail()


And here we see that indeed the moving average columns have been correctly added.
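To see why the early rows come out as ‘NaN’, here is a minimal sketch on a made-up price series. The prices and the 3-day window are purely illustrative assumptions, not S&P data; the idea is the same as for the 42 and 252 day windows.

```python
import pandas as pd

# Hypothetical closing prices, used only to illustrate rolling().mean()
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="Close")

# A 3-day simple moving average: the first (window - 1) rows have no value yet
sma3 = prices.rolling(window=3).mean()

print(sma3.tolist())  # [nan, nan, 11.0, 12.0, 13.0]
```

By the same logic, the ‘252d’ column has no value for the first 251 rows of the real data.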

Now let’s go ahead and plot the closing prices and moving averages together on the same chart.

sp500[['Close','42d','252d']].plot(grid=True,figsize=(8,5))


Our basic data set is pretty much complete now; all that’s really left to do is devise a rule to generate our trading signals.

We will have 3 basic states/rules:

1) Buy Signal (go long) – the 42d moving average is for the first time X points above the 252d trend.

2) Park in Cash – no position.

3) Sell Signal (go short) – the 42d moving average is for the first time X points below the 252d trend.

The first step in creating these signals is to add a new column to the DataFrame which is just the difference between the two moving averages:

sp500['42-252'] = sp500['42d'] - sp500['252d']

The next step is to formalise the signals by adding a further column which we will call Stance. We also set our signal threshold ‘X’ to 50 (this is somewhat arbitrary and can be optimised at some point).

X = 50
sp500['Stance'] = np.where(sp500['42-252'] > X, 1, 0)
sp500['Stance'] = np.where(sp500['42-252'] < -X, -1, sp500['Stance'])
sp500['Stance'].value_counts()

The last line of code above counts the number of trading days spent in each state. The run used for this post produced:

-1    2077
 1    1865
 0     251
Name: Stance, dtype: int64

A word of caution on these figures: the 251 flat days match exactly the 251 rows at the start of the data for which the 252d moving average is still ‘NaN’, which suggests they were generated with the short condition written as ‘< X’ rather than ‘< -X’. Read that way, the 42d moving average lies more than 50 points above the 252d moving average on 1865 trading dates, and fails to clear that threshold on 2077. With the symmetric ‘-X’ threshold shown above, expect more days in the flat state and fewer in the short state.
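As a sanity check on the three-state rule, here is a minimal sketch on made-up spread values. It assumes the short signal is meant to be symmetric (spread below -X); the numbers are hypothetical and chosen so that each state appears at least once.

```python
import numpy as np
import pandas as pd

X = 50  # threshold in index points, as in the post

# Hypothetical spread values (42d minus 252d), including a NaN warm-up row
spread = pd.Series([np.nan, 75.0, 20.0, -20.0, -75.0])

# Long above +X, short below -X, otherwise flat.
# Comparisons against NaN are False, so the warm-up row stays flat (0).
stance = np.where(spread > X, 1, 0)
stance = np.where(spread < -X, -1, stance)

print(stance.tolist())  # [0, 1, 0, 0, -1]
```

Note that spreads of 20 and -20 both land in the flat state, which is the whole point of having a threshold.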

A quick plot shows a visual representation of this ‘Stance’. I have set the ‘ylim’ (which is the y axis limits) to just above 1 and just below -1 so we can actually see the horizontal parts of the line.

sp500['Stance'].plot(lw=1.5,ylim=[-1.1,1.1])


Everything is now in place to test our investment strategy based upon the signals we have generated. In this instance we assume for simplicity that the S&P500 index can be bought or sold directly and that there are no transaction costs. In reality we would need to gain exposure to the index through ETFs, index funds or futures on the index…and of course there would be transaction costs to pay! Hopefully this omission won’t have too much of an effect as we don’t plan to be in and out of trades “too often”.
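If you want to check how often the strategy actually changes position, one rough way is to count the days on which the stance differs from the previous day. The sketch below uses a made-up stance series; the variable names are just for illustration.

```python
import pandas as pd

# Hypothetical stance series: 1 = long, 0 = flat, -1 = short
stance = pd.Series([0, 0, 1, 1, 1, 0, -1, -1, 0])

# A position change is any day whose stance differs from the previous day's.
# The first row has no previous day, so its NaN difference is treated as 0.
changes = stance.diff().fillna(0).ne(0)

print(int(changes.sum()))  # 4
```

On the real data the same idea would be applied to the sp500['Stance'] column.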

So in this model, our investor is either long the market, short the market or flat – this allows us to work with market returns and simply multiply the day’s market return by -1 if he is short, 1 if he is long and 0 if he is flat the previous day.

So we add yet another column to the DataFrame to hold the daily log returns of the index and then multiply that column by the ‘Stance’ column to get strategy returns:

sp500['Market Returns'] = np.log(sp500['Close'] / sp500['Close'].shift(1))
sp500['Strategy'] = sp500['Market Returns'] * sp500['Stance'].shift(1)

Note how we have shifted the ‘Stance’ series down by one row, so that we are using the ‘Stance’ at the close of the previous day to calculate the return on the next day.
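The effect of the shift is easiest to see on a tiny example. The returns and stances below are made up purely for illustration.

```python
import pandas as pd

# Hypothetical daily log returns and end-of-day stances
returns = pd.Series([0.01, -0.02, 0.03, 0.01])
stance = pd.Series([1, 1, -1, 0])

# Yesterday's stance earns today's return; the first day has no prior stance
strategy = returns * stance.shift(1)

print(strategy.round(4).tolist())  # [nan, -0.02, 0.03, -0.01]
```

Without the shift, the stance decided at today's close would be credited with today's return, which is information we would not have had in time to trade on.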

Now we can plot the returns of the S&P500 versus the returns on the moving average crossover strategy on the same chart for comparison:

sp500[['Market Returns','Strategy']].cumsum().plot(grid=True,figsize=(8,5))


So we can see that although the strategy seems to perform rather well during market downturns, it doesn’t do so well during market rallies or when it is just trending upwards.

Over the test period it barely outperforms a simple buy and hold strategy, which is hardly enough to call it a “successful” strategy.
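One thing to keep in mind when reading the chart is that it shows cumulative log returns rather than percentage gains. A minimal sketch of converting between the two, using made-up prices rather than real index data:

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices (not real index data)
close = pd.Series([100.0, 110.0, 99.0, 121.0])

# Daily log returns, computed the same way as in the post
log_ret = np.log(close / close.shift(1))

# Cumulative log return, then converted back to the growth of 1 unit invested
cum_log = log_ret.cumsum()
growth = np.exp(cum_log)

# The final growth factor equals last price / first price, i.e. 1.21 here
print(round(float(growth.iloc[-1]), 4))
```

Log returns are convenient because they simply add up over time, but you need the exponential to turn the running total back into an actual growth figure.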

But there we have it: a simple moving average crossover strategy backtested in Python from start to finish in just a few lines of code!


9 thoughts on “Moving Average Crossover Trading Strategy Backtest in Python”

  1. Hi, I am having trouble with these lines. By any chance would you be able to assist?

    sp500['42d'] = np.round(sp500['Close'].rolling(window=42).mean(),2)
    sp500['252d'] = np.round(sp500['Close'].rolling(window=252).mean(),2)

    1. Sure thing… What is it that you’re having problems with exactly? If you could provide a little bit more information, I’ll try to help…

      Are you getting an error message? If you could post it here, I’ll take a look.

  2. Thank you very much for responding to my initial comment, I really appreciate it and I was able to solve the issue. (100% my fault) These tutorials are great. THANK YOU VERY MUCH AGAIN!!

    I have another question/thought about this back-test. If we were using shorter moving averages, would it be possible to create the following parameters:

    (1) If the short moving average crosses above the long moving average go long for x days.
    (2) if the short moving average crosses below the long moving average short for x days.
    (3a) If there is an additional crossover during holding period ignore it
    (3b) If there are not crossovers hold cash

    Thanks,
    Sal

    1. Hi Sal, thanks for the kind words…happy to know my online ramblings are of help to at least one or two people!

      Your questions are good ones, and ones that I am sure many people would have when looking into an MA crossover trading strategy. I have had a play around and I believe I have come up with something that will get you what you want. It’s not the fastest of code, and it sure ain’t the prettiest either, but the final outcome follows the logic of what you have asked for…so here it is:

      #import relevant modules
      import pandas as pd
      import numpy as np
      from pandas_datareader import data
      from math import sqrt
      import matplotlib.pyplot as plt
      %matplotlib inline
       
       
      #download data into DataFrame and create moving averages columns
      sp500 = data.DataReader('^GSPC', 'yahoo',start='1/1/2014')
      sp500['42d'] = np.round(sp500['Close'].rolling(window=42).mean(),2)
      sp500['252d'] = np.round(sp500['Close'].rolling(window=252).mean(),2)
       
      #create column with moving average spread differential
      sp500['42-252'] = sp500['42d'] - sp500['252d']
       
      #set desired number of points as threshold for spread difference and create column containing strategy 'Stance'
      X = 50
      sp500['Stance'] = np.where(sp500['42-252'] > X, 1, 0)
      sp500['Stance'] = np.where(sp500['42-252'] < -X, -1, sp500['Stance'])
      sp500['Stance'].value_counts()
       
      #create columns containing daily market log returns and strategy daily log returns
      sp500['Market Returns'] = np.log(sp500['Close'] / sp500['Close'].shift(1))
      sp500['Strategy'] = sp500['Market Returns'] * sp500['Stance'].shift(1)
       
      #set up a new column to hold our stance relevant for the prespecified holding period
      sp500['Stance2'] = 0
       
      #set out predetermined holding period, after which time we will go back to holding cash and wait
      #for the next moving average cross over - also we will ignore any extra crossovers during this holding period
      days = 50
       
      #iterate through the DataFrame and update the "Stance2" column to hold the relevant stance
      #(look up the column position once so each assignment is a single .iloc call, not chained indexing)
      stance2_col = sp500.columns.get_loc('Stance2')
      #start at row 1 so that row i-1 always exists
      for i in range(1,len(sp500)):
          #logical test to check for 1) a cross over short over long MA 2) That we are currently in cash
          if (sp500['Stance'].iloc[i] > sp500['Stance'].iloc[i-1]) and (sp500['Stance'].iloc[i-1] == 0) and (sp500['Stance2'].iloc[i-1] == 0):
              #populate the DataFrame forward in time for the amount of days in our holding period
              for k in range(days):
                  try:
                      sp500.iloc[i+k, stance2_col] = 1
                      sp500.iloc[i+k+1, stance2_col] = 0
                  except IndexError:
                      #holding period runs past the end of the data
                      pass
          #logical test to check for 1) a cross over short under long MA 2) That we are currently in cash
          if (sp500['Stance'].iloc[i] < sp500['Stance'].iloc[i-1]) and (sp500['Stance'].iloc[i-1] == 0) and (sp500['Stance2'].iloc[i-1] == 0):
              #populate the DataFrame forward in time for the amount of days in our holding period
              for k in range(days):
                  try:
                      sp500.iloc[i+k, stance2_col] = -1
                      sp500.iloc[i+k+1, stance2_col] = 0
                  except IndexError:
                      #holding period runs past the end of the data
                      pass
       
       
      #Calculate daily market returns and strategy daily returns
      sp500['Market Returns'] = np.log(sp500['Close'] / sp500['Close'].shift(1))
      sp500['Strategy'] = sp500['Market Returns'] * sp500['Stance2'].shift(1)
       
      #plot strategy returns vs market returns
      sp500[['Market Returns','Strategy']].cumsum().plot(grid=True,figsize=(8,5))
      plt.show()
       
      #set strategy starting equity to 1 (i.e. 100%) and generate equity curve
      sp500['Strategy Equity'] = sp500['Strategy'].cumsum() + 1
       
      #show chart of equity curve
      sp500['Strategy Equity'].plot(grid=True,figsize=(8,5))
      plt.show()

      Couple of things to be aware of:

      1) The “threshold” of the distance that the MA series need to diverge by to count as a “cross over” has been set at 50. This can be changed and optimised according to your own preferences. For example, if you wanted the MA lines to JUST cross to count as a “cross over” you could set the threshold (variable X) to 1.

      2) I have set the “days” variable to 50 – this is the holding period, and of course you can change this at will also.

      Hope that helps and if you have any further questions, please do ask.

      1. Thank you for the response. I am having some trouble understanding this piece of code. The code is working but I would like to better understand it. I am primarily confused with the iloc, and k and i. I really don’t understand what those are or where they are pulling information from. Any clarity would be greatly appreciated!!

        #iterate through the DataFrame and update the "Stance2" column to hold the relevant stance
        for i in range(X,len(sp500)):
            #logical test to check for 1) a cross over short over long MA 2) That we are currently in cash
            if (sp500['Stance'].iloc[i] > sp500['Stance'].iloc[i-1]) and (sp500['Stance'].iloc[i-1] == 0) and (sp500['Stance2'].iloc[i-1] == 0):
                #populate the DataFrame forward in time for the amount of days in our holding period
                for k in range(days):
                    try:
                        sp500['Stance2'].iloc[i+k] = 1
                        sp500['Stance2'].iloc[i+k+1] = 0
                    except:
                        pass
            #logical test to check for 1) a cross over short under long MA 2) That we are currently in cash
            if (sp500['Stance'].iloc[i] < sp500['Stance'].iloc[i-1]) and (sp500['Stance'].iloc[i-1] == 0) and (sp500['Stance2'].iloc[i-1] == 0):
                #populate the DataFrame forward in time for the amount of days in our holding period
                for k in range(days):
                    try:
                        sp500['Stance2'].iloc[i+k] = -1
                        sp500['Stance2'].iloc[i+k+1] = 0
                    except:
                        pass

        1. Hi there, no problem at all…glad to hear the code works as intended, at least.

          In terms of your other questions regarding the “iloc” and the k and i, I think they may be best tackled in a separate blog post centered around that section of code specifically; it would be a little tough to explain it all properly in these comment boxes.

          I’ll try my best to find some time this weekend and put something together for you that will hopefully make it a little clearer as to what the code is actually doing.

          Until then…
