Moving Average Crossover Trading Strategy Backtest in Python

by Stuart Jamieson

Hi all, for this post I will be building a simple moving average crossover trading strategy backtest in Python, using the S&P500 as the market to test on.

A simple moving average crossover strategy is possibly one of the simplest examples, if not the simplest, of a rules-based trading strategy using technical indicators, so I thought this would be a good example for those learning Python; we will try to keep it as simple as possible and build up from there.

So as always when using Python for financial data related shenanigans, it’s time to import our required modules:

import pandas as pd
import numpy as np
from pandas_datareader import data

We will first use the pandas-datareader functionality to download the price data from the first trading day in 2000, until today, for the S&P500 from Yahoo Finance as follows:

sp500 = data.DataReader('^GSPC', 'yahoo',start='1/1/2000')

Ok, let’s do a quick check to see what format the data has been pulled down in.

sp500.head()

Good stuff, so let’s create a quick plot of the closing prices to see how the S&P has performed over the period.

sp500['Close'].plot(grid=True,figsize=(8,5))

The trend strategy we want to implement is based on the crossover of two simple moving averages; the 2 months (42 trading days) and 1 year (252 trading days) moving averages.

Our first step is to create the moving average values and simultaneously append them to new columns in our existing sp500 DataFrame.

sp500['42d'] = np.round(sp500['Close'].rolling(window=42).mean(),2)
sp500['252d'] = np.round(sp500['Close'].rolling(window=252).mean(),2)

The above code both creates the series and automatically adds them to our DataFrame. We can see this as follows (I use the ‘.tail’ call here as the moving averages don’t actually hold values until day 42 and day 252, so they will just show up as ‘NaN’ in a ‘.head’ call):

sp500.tail()

And here we see that indeed the moving average columns have been correctly added.
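
To see why the first rows are empty, here is a minimal sketch on a made-up price series (the numbers and the 3-day window are purely illustrative and are not S&P data): a rolling mean only produces a value once a full window of observations is available.

```python
import pandas as pd

# toy closing prices, purely illustrative
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# 3-day simple moving average: the first 2 rows have no full window yet
sma3 = prices.rolling(window=3).mean()
print(sma3)

# the number of leading NaN values is window - 1
print(sma3.isna().sum())
```

By the same logic the ‘42d’ column starts with 41 empty rows and the ‘252d’ column with 251.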

Now let’s go ahead and plot the closing prices and moving averages together on the same chart.

sp500[['Close','42d','252d']].plot(grid=True,figsize=(8,5))

Our basic data set is pretty much complete now; all that’s really left to do is devise a rule to generate our trading signals.

We will have 3 basic states/rules:

1) Buy Signal (go long) – the 42d moving average is for the first time X points above the 252d trend.

2) Park in Cash – no position.

3) Sell Signal (go short) – the 42d moving average is for the first time X points below the 252d trend.

The first step in creating these signals is to add a new column to the DataFrame which is just the difference between the two moving averages:

sp500['42-252'] = sp500['42d'] - sp500['252d']

The next step is to formalise the signals by adding a further column which we will call Stance. We also set our signal threshold ‘X’ to 50 (this is somewhat arbitrary and can be optimised at some point)

X = 50
sp500['Stance'] = np.where(sp500['42-252'] > X, 1, 0)
sp500['Stance'] = np.where(sp500['42-252'] < -X, -1, sp500['Stance'])
sp500['Stance'].value_counts()

(N.B. there was an error in the logic of the above lines of code when this article was first posted, so you will very possibly get significantly different results even if using the same inputs and time period of data as I have. The error was that I had omitted the minus sign in front of the “X” in the second line of code in the above code box; it was kindly pointed out by Theodore in the comments section on 07/03/2019.)

The last line of code above produces:

-1    2077
 1    1865
 0     251
Name: Stance, dtype: int64

Showing that during the time period we have chosen to backtest, on 2077 trading dates the 42d moving average lies more than 50 points below the 252d moving average, and on 1865 the 42d moving average lies more than 50 points above the 252d moving average.
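
As a quick illustration of how the two ‘np.where’ calls combine into three states, here is a small sketch on made-up spread values (the numbers are hypothetical, chosen only so that every state appears). Note that a row where the spread is still ‘NaN’ fails both comparisons and therefore ends up as 0.

```python
import numpy as np
import pandas as pd

X = 50  # same threshold as in the post

# made-up spread values (42d minus 252d), chosen to hit all three states
spread = pd.Series([75.0, 20.0, -10.0, -60.0, np.nan])

# 1 when the spread is above +X, otherwise 0 for now
stance = np.where(spread > X, 1, 0)
# -1 when the spread is below -X, otherwise keep what we already have
stance = np.where(spread < -X, -1, stance)
print(stance)
```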

A quick plot shows a visual representation of this ‘Stance’. I have set the ‘ylim’ (which is the y axis limits) to just above 1 and just below -1 so we can actually see the horizontal parts of the line.

sp500['Stance'].plot(lw=1.5,ylim=[-1.1,1.1])

Everything is now in place to test our investment strategy based upon the signals we have generated. In this instance we assume for simplicity that the S&P500 index can be bought or sold directly and that there are no transaction costs. In reality we would need to gain exposure to the index through ETFs, index funds or futures on the index…and of course there would be transaction costs to pay! Hopefully this omission won’t have too much of an effect as we don’t plan to be in and out of trades “too often”.

So in this model, our investor is either long the market, short the market or flat – this allows us to work with market returns and simply multiply the day’s market return by -1 if he is short, 1 if he is long and 0 if he is flat the previous day.

So we add yet another column to the DataFrame to hold the daily log returns of the index and then multiply that column by the ‘Stance’ column to get strategy returns:

sp500['Market Returns'] = np.log(sp500['Close'] / sp500['Close'].shift(1))
sp500['Strategy'] = sp500['Market Returns'] * sp500['Stance'].shift(1)

Note how we have shifted the sp500[‘Stance’] series down by one day, so that we are using the ‘Stance’ at the close of the previous day to calculate the return on the next day.
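
A tiny sketch on made-up numbers may make the effect of the shift clearer (the returns and stances below are hypothetical): each day’s market return is multiplied by the stance that was known at the previous close, so the first day has no strategy return at all.

```python
import pandas as pd

# toy data: daily market returns and the stance decided at each day's close
df = pd.DataFrame({
    'Market Returns': [0.01, -0.02, 0.03, -0.01],
    'Stance': [1, 1, -1, 0],
})

# yesterday's stance earns today's return; the first day has no prior stance
df['Strategy'] = df['Market Returns'] * df['Stance'].shift(1)
print(df)
```

Without the shift we would be trading on a signal that is only known after the day’s return has already happened.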

Now we can plot the returns of the S&P500 versus the returns on the moving average crossover strategy on the same chart for comparison:

sp500[['Market Returns','Strategy']].cumsum().plot(grid=True,figsize=(8,5))

So we can see that although the strategy seems to perform rather well during market downturns, it doesn’t do so well during market rallies or when it is just trending upwards.

Over the test period it barely outperforms a simple buy and hold strategy, which is hardly enough to call it a “successful” strategy.
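
One thing worth keeping in mind is that the cumulative figures plotted above are sums of log returns. As a rough sketch on made-up prices (not S&P data), the cumulative log return can be turned back into the growth of one unit invested by taking the exponential:

```python
import numpy as np
import pandas as pd

# toy closing prices, purely illustrative
close = pd.Series([100.0, 110.0, 99.0, 121.0])

# daily log returns, calculated as in the post
log_ret = np.log(close / close.shift(1))

# cumulative log return (the quantity that is plotted)
cum_log = log_ret.cumsum()

# equivalent growth of 1 unit invested, via the exponential
growth = np.exp(cum_log)
print(growth)
```

For the market this recovers the price ratio exactly. For the strategy it is only an approximation on the days we are short, since multiplying a log return by -1 is not quite the same as the return on a short position.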

But there we have it; a simple moving average crossover strategy backtested in Python from start to finish in just a few lines of code!!

30 comments

sal 26 November 2016 - 20:05

HI I am having trouble with this line. By any chance would you be able to assist?

sp500['42d'] = np.round(sp500['Close'].rolling(window=42).mean(),2)
sp500['252d'] = np.round(sp500['Close'].rolling(window=252).mean(),2)

Reply
s666 26 November 2016 - 23:46

Sure thing… What is it that you’re having problems with exactly? If you could provide a little bit more information, I’ll try to help…

Are you getting an error message? If you could post it here, I’ll take a look.

Reply
Sal 30 November 2016 - 03:25

Thank you very much for responding to my initial comment, I really appreciate it and I was able to solve the issue. (100% my fault) These tutorials are great. THANK YOU VERY MUCH AGAIN!!

I have another question/thought about this back-test. If we were using shorter moving averages, would it be possible to create the following parameters:

(1) If the short moving average crosses above the long moving average go long for x days.
(2) if the short moving average crosses below the long moving average short for x days.
(3a) If there is an additional crossover during holding period ignore it
(3b) If there are not crossovers hold cash

Thanks,
Sal

Reply
s666 3 December 2016 - 13:16

Hi Sal, thanks for the kind words…happy to know my online ramblings are of help to at least one or two people!

Your questions are good ones, and ones that I am sure many people would have when looking into an MA crossover trading strategy. I have had a play around and I believe I have come up with something that will get you what you want. It’s not the fastest of code, and it sure ain’t the prettiest either, but the final outcome follows the logic of what you have asked for…so here it is:

#import relevant modules
import pandas as pd
import numpy as np
from pandas_datareader import data
from math import sqrt
import matplotlib.pyplot as plt
%matplotlib inline
 
 
#download data into DataFrame and create moving averages columns
sp500 = data.DataReader('^GSPC', 'yahoo',start='1/1/2014')
sp500['42d'] = np.round(sp500['Close'].rolling(window=42).mean(),2)
sp500['252d'] = np.round(sp500['Close'].rolling(window=252).mean(),2)
 
#create column with moving average spread differential
sp500['42-252'] = sp500['42d'] - sp500['252d']
 
#set desired number of points as threshold for spread difference and create column containing strategy 'Stance'
X = 50
sp500['Stance'] = np.where(sp500['42-252'] > X, 1, 0)
sp500['Stance'] = np.where(sp500['42-252'] < -X, -1, sp500['Stance'])
sp500['Stance'].value_counts()
 
#create columns containing daily market log returns and strategy daily log returns
sp500['Market Returns'] = np.log(sp500['Close'] / sp500['Close'].shift(1))
sp500['Strategy'] = sp500['Market Returns'] * sp500['Stance'].shift(1)
#set up a new column to hold our stance relevant for the prespecified holding period
sp500['Stance2'] = 0
#set out predetermined holding period, after which time we will go back to holding cash and wait
#for the next moving average cross over - also we will ignore any extra crossovers during this holding period
days = 50
#column position of "Stance2", so that values can be set by position without chained indexing
stance2_col = sp500.columns.get_loc('Stance2')
#iterate through the DataFrame and update the "Stance2" column to hold the relevant stance
for i in range(X,len(sp500)):
    #logical test to check for 1) a cross over short over long MA 2) That we are currently in cash
    if (sp500['Stance'].iloc[i] > sp500['Stance'].iloc[i-1]) and (sp500['Stance'].iloc[i-1] == 0) and (sp500['Stance2'].iloc[i-1] == 0):
        #populate the DataFrame forward in time for the amount of days in our holding period
        for k in range(days):
            try:
                sp500.iloc[i+k, stance2_col] = 1
                sp500.iloc[i+k+1, stance2_col] = 0
            except IndexError:
                #we have run past the end of the DataFrame
                pass
    #logical test to check for 1) a cross over short under long MA 2) That we are currently in cash
    if (sp500['Stance'].iloc[i] < sp500['Stance'].iloc[i-1]) and (sp500['Stance'].iloc[i-1] == 0) and (sp500['Stance2'].iloc[i-1] == 0):
        #populate the DataFrame forward in time for the amount of days in our holding period
        for k in range(days):
            try:
                sp500.iloc[i+k, stance2_col] = -1
                sp500.iloc[i+k+1, stance2_col] = 0
            except IndexError:
                #we have run past the end of the DataFrame
                pass
    
#Calculate daily market returns and strategy daily returns
sp500['Market Returns'] = np.log(sp500['Close'] / sp500['Close'].shift(1))
sp500['Strategy'] = sp500['Market Returns'] * sp500['Stance2'].shift(1)
#plot strategy returns vs market returns
sp500[['Market Returns','Strategy']].cumsum().plot(grid=True,figsize=(8,5))
plt.show()
#set strategy starting equity to 1 (i.e. 100%) and generate equity curve
sp500['Strategy Equity'] = sp500['Strategy'].cumsum() + 1
 
#show chart of equity curve
sp500['Strategy Equity'].plot(grid=True,figsize=(8,5))
plt.show()

Couple of things to be aware of:

1) The "threshold" of the distance that the MA series need to diverge by to count as a "cross over" has been set at 50. This can be changed and optimised according to your own preferences. For example, if you wanted the MA lines to JUST cross to count as a "cross over" you could set the threshold (variable X) to 1.

2) I have set the "days" variable to 50 - this is the holding period, and of course you can change this at will also.
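
If it helps to see the holding period logic in isolation, below is a rough sketch of the same idea on a tiny made-up stance series. The function name ‘apply_holding_period’ is just something I have made up for illustration and is not part of the code above; "i" is the row where a new signal appears and "k" walks forward over the rows of the holding period.

```python
import pandas as pd

def apply_holding_period(stance, days):
    """Hypothetical helper: hold each new signal for `days` rows, ignoring
    any further signals until the holding period has ended."""
    values = stance.tolist()
    held = [0] * len(values)
    i = 1
    while i < len(values):
        # a new signal is a move away from 0 (cash) into 1 or -1
        if values[i - 1] == 0 and values[i] != 0:
            # fill the next `days` rows (or up to the end of the data)
            end = min(i + days, len(values))
            for k in range(i, end):
                held[k] = values[i]
            # skip one extra row so that the previous day is flat again,
            # mirroring the "currently in cash" check in the loop above
            i = end + 1
        else:
            i += 1
    return pd.Series(held, index=stance.index)

# toy stance series: a long signal, then a short signal during the hold
stance = pd.Series([0, 1, 1, 0, -1, -1, 0, 0, -1, 0])
print(apply_holding_period(stance, days=3).tolist())
```

With a 3 day holding period the short signal on row 4 arrives while we are still long, so it is ignored; the next signal that counts is the one on row 8.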

Hope that helps and if you have any further questions, please do ask.

Reply
algo 13 December 2016 - 23:04

Thank you for the response. I am having some trouble understanding this piece of code. The code is working but I would like to better understand it. I am primarily confused by the iloc, and the k and i. I really don’t understand what those are or where they are pulling information from. Any clarity would be greatly appreciated!!

#iterate through the DataFrame and update the "Stance2" column to hold the revelant stance
for i in range(X,len(sp500)):
    #logical test to check for 1) a cross over short over long MA 2) That we are currently in cash
    if (sp500['Stance'].iloc[i] > sp500['Stance'].iloc[i-1]) and (sp500['Stance'].iloc[i-1] == 0) and (sp500['Stance2'].iloc[i-1] == 0):
        #populate the DataFrame forward in time for the amount of days in our holding period
        for k in range(days):
            try:
                sp500['Stance2'].iloc[i+k] = 1
                sp500['Stance2'].iloc[i+k+1] = 0
            except:
                pass
    #logical test to check for 1) a cross over short under long MA 2) That we are currently in cash
    if (sp500['Stance'].iloc[i] < sp500['Stance'].iloc[i-1]) and (sp500['Stance'].iloc[i-1] == 0) and (sp500['Stance2'].iloc[i-1] == 0):
        #populate the DataFrame forward in time for the amount of days in our holding period
        for k in range(days):
            try:
                sp500['Stance2'].iloc[i+k] = -1
                sp500['Stance2'].iloc[i+k+1] = 0
            except:
                pass

Reply
s666 14 December 2016 - 23:57

Hi there, no problem at all…glad to hear the code works as intended, at least.

In terms of your other questions regarding the “iloc” and the k and i, I think they may be best tackled in a separate blog post centered around that section of code specifically; it would be a little tough to explain it all properly in these comment boxes.

I’ll try my best to find some time this weekend and put something together for you that will hopefully make it a little clearer as to what the code is actually doing etc.

Until then…

Reply
sal 15 December 2016 - 00:27

THANK YOU!!!!

s666 18 December 2016 - 15:09

Hi Sal – please find the latest blog post which hopefully answers your questions at https://www.pythonforfinance.net/2016/12/18/moving-average-crossover-trading-strategy-backtest-python-v-2-0/

May I ask – are you and “algo” the same person? I see posts by both yourself and “algo” about the same topic.

Regards 😀

Optimisation of Moving Average Crossover Trading Strategy In Python 28 January 2017 - 19:58

[…] Staying on the same topic of optimisation that we visited in the last post concerning portfolio holdings and efficient frontiers/portfolio theory, I thought I would quickly revisit the moving average crossover strategy we built a few posts ago; the previous article can be found here. […]

Reply
Moving Average Crossover Trading Strategy Backtest in Python - V 2.0 11 March 2017 - 06:49

[…] Welcome back…this post is going to deal with a couple of questions I received in the comments section of a previous post, one relating to a moving average crossover trading strategy – the article can be found here. […]

Reply
Analysis of Moving Average Crossover Strategy Backtest Returns Using Pandas 11 March 2017 - 06:50

[…] of the results we got from the moving average crossover strategy backtest in the last post (can be found here), and spend a bit of time digging a little more deeply into the equity curve and producing a bit of […]

Reply
songhaegyo 10 July 2017 - 01:16

Hi there, I am having a problem with the import of data from Yahoo using pandas. Could you please help?

  File "C:\Python27\lib\site-packages\requests\adapters.py", line 504, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='ichart.finance.yahoo.com', port=80): Max retries exceeded with url: /table.csv?a=0&ignore=.csv&s=%5EGSPC&b=1&e=10&d=6&g=d&f=2017&c=2000 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 11004] getaddrinfo failed',))

Reply
s666 20 July 2017 - 07:13

Hi, thanks for the comment and apologies for the delay in replying; I have been travelling these past 2 weeks. Unfortunately I believe the Yahoo Finance API has been discontinued, so it no longer works with the Pandas DataReader. You could use a provider like Quandl instead; the syntax is slightly different and the data comes down in a slightly different format, but with a few tweaks you can use it no problem. You will need to install the “quandl” module with “pip install” and then sign up at http://www.quandl.com. After that you can search for the contract you need and click the “Python” option under “Export Data” in the top right of the page.

Have a go at that and if you need any extra guidance or clarification, do let me know!

Reply
famson 11 August 2017 - 18:19

Hello:
When I ran this code line: sp500['Strategy'] = sp500['Market Returns'] * sp500['Stance'].shift(1), I got this error: AttributeError: 'numpy.ndarray' object has no attribute 'shift'
Please, what do you think I am doing wrong?

Reply
s666 11 August 2017 - 18:54

That’s very strange, sp500[‘Stance’] should be a “pandas.core.series.Series” not a “numpy.ndarray”.

Please try to run the code

type(sp500['Stance'])

and let me know what the output is.

Reply
famson 11 August 2017 - 19:47

I eventually sorted this out.

Btw, please do you have a code to graphically represent Lake Ratio & Gain to Pain ratio of such a strategy as above?

Reply
s666 12 August 2017 - 09:39

The Gain to Pain ratio is an easy one to do – I’ve had a quick play around and have some code that calculates and creates a very simple bar chart of the Gain to Pain data. The Lake Ratio is however a much more complicated process…I would have to have a think and spend some time trying to get something put together.

As a start, here is the code for the Gain to Pain…

#import relevant modules
import pandas as pd
import numpy as np
from pandas_datareader import data
from math import sqrt
import matplotlib.pyplot as plt
%matplotlib inline
 
 
#download data into DataFrame and create moving averages columns
strategy = data.DataReader('PG', 'google',start='1/1/2000')
strategy['42d'] = np.round(strategy['Close'].rolling(window=42).mean(),2)
strategy['252d'] = np.round(strategy['Close'].rolling(window=252).mean(),2)
 
#create column with moving average spread differential
strategy['42-252'] = strategy['42d'] - strategy['252d']
 
#set desired number of points as threshold for spread difference and create column containing strategy 'Stance'
X = 50
strategy['Stance'] = np.where(strategy['42-252'] > X, 1, 0)
strategy['Stance'] = np.where(strategy['42-252'] < -X, -1, strategy['Stance'])
strategy['Stance'].value_counts()
 
#create columns containing daily market log returns and strategy daily log returns
strategy['Market Returns'] = np.log(strategy['Close'] / strategy['Close'].shift(1))
strategy['Strategy'] = strategy['Market Returns'] * strategy['Stance'].shift(1)
 
#set strategy starting equity to 1 (i.e. 100%) and generate equity curve
strategy['Strategy Equity'] = strategy['Strategy'].cumsum() + 1
 
#show chart of equity curve
strategy['Strategy Equity'].plot()
#Resample the strategy into monthly data
stratm = pd.DataFrame(strategy['Strategy'].resample('M').sum())
#Add a month column for later translation
stratm['Month'] = stratm.index.month

#Calculate the sum of returns
sum_returns = strategy['Strategy'].sum()
#Calculate the sum of absolute negative monthly returns
sum_neg_months = abs(stratm[stratm < 0].sum().sum())
#Calculate the Gain to Pain ratio
gain_to_pain = sum_returns / sum_neg_months
#Create a DataFrame to hold Gain to Pain data
d = {'Sum of Returns': sum_returns,'Negative Sum':sum_neg_months,'Gain to Pain': gain_to_pain}
gain_to_pain_df = pd.DataFrame.from_dict(d,orient='index')
#Plot Gain to Pain data on a bar chart
gain_to_pain_df.plot.bar(legend=False)
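
For reference, the core of the calculation above can be boiled down to a few lines. This is only a sketch on made-up monthly returns, and the function name ‘gain_to_pain’ is my own choice for illustration:

```python
import pandas as pd

def gain_to_pain(monthly_returns):
    """Sum of all monthly returns divided by the absolute sum of the
    negative monthly returns (hypothetical helper name)."""
    total = monthly_returns.sum()
    pain = abs(monthly_returns[monthly_returns < 0].sum())
    # with no losing months the ratio is undefined
    return total / pain if pain > 0 else float('nan')

# made-up monthly returns, purely illustrative
monthly = pd.Series([0.04, -0.02, 0.03, -0.01, 0.02])
print(round(gain_to_pain(monthly), 4))
```

Here the returns sum to 0.06 and the losing months sum to -0.03, so the ratio comes out at 2.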
Reply
famson 15 August 2017 - 06:23

Thank you. That was really helpful

Reply
famson 20 August 2017 - 08:06

Hello: one more problem, please. I am trying to plot a simple chart that shows the bearish and bullish periods distinctly using the exponential moving average, creating a new regime, etc. I would appreciate any help, as I need more education on this.

Reply
s666 20 August 2017 - 09:17

Hi Famson, you can just use the Exponential Weighted Average method included in the Pandas library…

Take a look at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.ewm.html

That explains its use.

So for example we could use the code:

sp500['Adj Close'].ewm(com=0.5).mean()

to get the exponential weighted average of the sp500 Adjusted Close using a “Centre of Mass” of 0.5.

If you wanted to plot it, just add “.plot()” at the end of the line above.
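
As for marking bullish and bearish periods, one possible approach (just a sketch on made-up prices, with arbitrary spans of 2 and 5 that I have picked purely for illustration) is to compare a fast and a slow exponential moving average and store the result as a "regime" column:

```python
import numpy as np
import pandas as pd

# toy closing prices that rise and then fall, purely illustrative
close = pd.Series([100, 102, 104, 106, 108, 104, 100, 96, 92, 88], dtype=float)

# fast and slow exponential moving averages (the spans are arbitrary choices)
fast = close.ewm(span=2, adjust=False).mean()
slow = close.ewm(span=5, adjust=False).mean()

# regime is 1 (bullish) while the fast EMA is above the slow EMA, else -1
regime = pd.Series(np.where(fast > slow, 1, -1), index=close.index)
print(regime.tolist())
```

The regime series can then be used to split the price data into bullish and bearish stretches before plotting.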

Hope that helps.

Reply
famson 22 August 2017 - 05:56

Yes it does. But what I was actually looking at is MA that use different colour and trendline for downward movement and upward movement.
In addition, I am also looking at pairs trade between these 2 indices using specific indicator.
Thank you

Reply
582407 25 October 2017 - 01:18

Thank you very much for this series of tutorials! I mean all your WORK! Excellent work! Keep it coming please!!!!!

Reply
Stephen 7 December 2017 - 17:20

Brilliant work indeed! Thank you very much. Would be nice if you could clarify my below doubt.

I have a csv file. 6 columns in the below format.

Date Stock 1 Price Stock 2 Price Stock 3 Price Stock 4 Price Market Index Price

But the thing here is I have the price data stored in csv on desktop. I would like to utilise mine instead of pulling from Yahoo. And yes, the thing is Stock 1 is the indicator. That is the whole strategy of crossover signal is obtained just from second column stock 1 price list. Based on this signal, stock 2, 3, 4 is purchased weighted equally. Could you kindly advice me on the code that I need to input as a replacement.

Also, I don’t want to bring in the short position.
Just long position and hold on to it – when short moving avg crosses above long moving average
Sell position entirely – when short moving avg crosses below long moving average; then buy back once again after 5 trading days.

I have been struggling a lot with the code as I’m a newbie in python. It would be really kind of you, if you assist me with the code. Thank you once again for the fantastic work of yours. Keep going.

Reply
S666 14 December 2017 - 18:01

Hi Stephen,

Apologies for the delay in replying. With regard to the request above, to read in a csv file you can use pandas “read_csv”:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
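
A minimal sketch of how that might look is below. The column names ‘Date’ and ‘Close’ and the prices are assumptions made up for the example, and I have used an in-memory string in place of a file so that the snippet runs on its own; with a real file you would pass its path to “read_csv” instead.

```python
import io
import pandas as pd

# stand-in for a file on disk; with a real file, pass its path instead
csv_text = """Date,Close
2017-01-03,100.00
2017-01-04,101.50
2017-01-05,99.75
"""

# parse the Date column as dates and use it as the index, which gives
# a DataFrame with the same layout as the one used in the post
prices = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'], index_col='Date')
print(prices)
```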

With regard to the other criteria specified, may I ask what you have come up with so far? If you post it, perhaps I can take a look through and suggest areas to modify etc.

I will reply to you via email too to see if I can help.

Cheers
S666

Reply
Stephen 15 December 2017 - 01:57

Hi, there is a small change in my problem. Below is the code I’m using. I am just stuck on the threshold part, that is, rebalancing the portfolio only if it deviates beyond a threshold of, say, 5%. I would like to put this condition before initiating the rebalance so that it doesn’t rebalance every month for even a small deviation. Could you please guide me? Thank you very much.

#http://pmorissette.github.io/bt/_modules/bt/algos.html

import bt

# fetch some data and also if out of these stocks, the recent listed date range is considered
data = bt.get('VTI, BND',start='2007,01,11',end='2017,01,11')
print (data.head())

class OrderedWeights(bt.Algo):
    def __init__(self, weights):
        self.target_weights = weights

    def __call__(self, target):
        target.temp['weights'] = dict(zip(target.temp['selected'], self.target_weights))
        return True

#commission
def my_comm(q, p):
    return abs(q) * 0.5

# create the strategy & if you need it to run weekly the rebalancing use Weekly instead of Monthly
s = bt.Strategy('Portfolio1', [bt.algos.RunMonthly(),
                               bt.algos.SelectAll(),
                               OrderedWeights([0.5, 0.5]),
                               bt.algos.Rebalance()])

# create a backtest and run it
test = bt.Backtest(s, data, initial_capital=10000, commissions=my_comm)
res = bt.run(test)

# first let's see an equity curve
res.display()

res.plot()

# ok and how does the return distribution look like
res.plot_histogram()

# and just to make sure everything went along as planned, let's plot the security weights over time
res.plot_security_weights()

Reply
spwcd 20 September 2018 - 05:43

just wanna know the reason when you sum up strategy return
why don’t you used np.exp to the log return ?

Reply
Theodore 7 March 2019 - 20:26

Hey, I am a bit confused about this part:

sp500['Stance'] = np.where(sp500['42-252'] < X, -1, sp500['Stance'])

I think we should have taken the absolute value and changed sign to greater.

For example, if we have 100 and 80, that will be 20 which will be < 50 which was the limit. However do we want it like this? I thought we wanted only cases where say 50 -110 = -60 which is a sell.

Reply
s666 7 March 2019 - 20:40

Hi Theodore – you are indeed correct!! Thanks very much for pointing this out…it’s quite an egregious error on my part, as it’s an important part of the logic!!!

The line of code should read:

sp500['Stance'] = np.where(sp500['42-252'] < -X, -1, sp500['Stance'])

I had omitted the minus sign in front of the "X" - we are indeed looking for the value of the 42 period MA minus the 252 period MA to be lower than MINUS 50!!
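
To make the difference concrete, here is a small sketch using the two cases from your example (a spread of +20 and a spread of -60), comparing the old test with the corrected one:

```python
import numpy as np
import pandas as pd

X = 50
# spread of +20 (100 minus 80) and spread of -60 (50 minus 110)
spread = pd.Series([20.0, -60.0])

# old, incorrect test: a small positive spread is wrongly flagged as a sell
wrong = np.where(spread < X, -1, 0)

# corrected test: only a spread below -X is flagged as a sell
right = np.where(spread < -X, -1, 0)

print(wrong)
print(right)
```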

Again - thanks for bringing that to my attention - I shall change the code accordingly.

Reply
efueyo 2 July 2019 - 18:17

Thank you very much for providing us access to these tutorials. I am a retiree who learns Python by studying the resources that he finds on the Internet. Trying to understand these scripts, I have the following questions.
a) What criteria should we follow to set our signal threshold ‘X’? You use X = 50 for the SP_500. I have tested with “IBE.MC” and with quotes from two Investment Funds and, if the threshold is not 0 or very close to 0, practically all the resulting “stances” are zero and the whole post-processing of the scripts is a disaster.
b) .- The calculation of Volatility / Max Drawdown, always gives me the error “ZeroDivisionError: float division by zero”

I will appreciate any suggestions to set these concepts.

Reply
s666 7 July 2019 - 07:12

Hi there, apologies for the late reply. I will email you directly and help you with this, that will probably be easier than commenting back and forth. Check your inbox shortly 🙂

Reply
