# Intraday Stock Mean Reversion Trading Backtest in Python

**Intraday Stock Mean Reversion Trading Backtest in Python**

After completing the series on creating an inter-day mean reversion strategy, I thought it may be an idea to visit another mean reversion strategy, but one that works on an intra-day scale. That is, we will be looking for the mean reversion to take place within one trading day.

Stock prices tend to follow geometric random walks, as we are often reminded by countless financial scholars; but this is true only if we test their price series for mean reversion strictly at regular intervals, such as using their daily closing price. Our job is to find special conditions where mean reversion occurs with regularity. As the following strategy will show, there may indeed be seasonal mean reversion occurring at the intra-day time frame for stocks.

The strategy rules are as follows:

1) Select all stocks near the market open whose returns from their previous day’s lows to today’s opens are lower than one standard deviation. The standard deviation is computed using the daily close-to-close returns of the last 90 days. These are stocks that “gapped down”.

2) Narrow down this list of stocks by requiring that their open prices be higher than the 20-day moving average of the closing prices.

3) Liquidate the positions at the market close.

So we will first begin with our necessary module imports as follows:

import pandas as pd import numpy as np from pandas_datareader import data from math import sqrt |

I will be running this backtest using the NYSE stock universe which contains 3159 stock – you can download the ticker list by clicking on the download button below.

NYSE Stock List

Once you have that file stored somewhere, we can feed it in using pandas, and set up our stock ticker list as follows:

#make sure the NYSE.txt file is in the same folder as your python script file stocks = pd.read_csv('NYSE.txt',delimiter="\t") #set up our empty list to hold the stock tickers stocks_list = [] #iterate through the pandas dataframe of tickers and append them to our empty list for symbol in stocks['Symbol']: stocks_list.append(symbol) |

As a quick check to see if they have been fed in correctly:

len(stocks_list) |

should produce

`3159` |

and

stocks_list[:5] |

Should produce:

['A', 'AA', 'AAC', 'AAN', 'AAP'] |

Ok great, so now we have our list of stocks that we wish to use as our “investment universe” – we can begin to write the code for the actual backtest.

The logic of our approach is as follows…we will iterate through the list of stock tickers, each time we will download the relevant price data into a DataFrame and then add a couple of columns to help us create signals as to when our two criteria are met (gap down of larger than 1 90 day rolling standard deviation and an opening price above the 20 day moving average).

We will then use these signals to create our return series for that stock, and then store that information by appending each stocks return series to a list. Finally we will concatenate all those return series into a master DataFrame and calculate our overall daily return.

#create empty list to hold our return series DataFrame for each stock frames = [] for stock in stocks_list: try: #download stock data and place in DataFrame df = data.DataReader(stock, 'yahoo',start='1/1/2000') #create column to hold our 90 day rolling standard deviation df['Stdev'] = df['Close'].rolling(window=90).std() #create a column to hold our 20 day moving average df['Moving Average'] = df['Close'].rolling(window=20).mean() #create a column which holds a TRUE value if the gap down from previous day's low to next #day's open is larger than the 90 day rolling standard deviation df['Criteria1'] = (df['Open'] - df['Low'].shift(1)) < -df['Stdev'] #create a column which holds a TRUE value if the opening price of the stock is above the 20 day moving average df['Criteria2'] = df['Open'] > df['Moving Average'] #create a column that holds a TRUE value if both above criteria are also TRUE df['BUY'] = df['Criteria1'] & df['Criteria2'] #calculate daily % return series for stock df['Pct Change'] = (df['Close'] - df['Open']) / df['Open'] #create a strategy return series by using the daily stock returns where the trade criteria above are met df['Rets'] = df['Pct Change'][df['BUY'] == True] #append the strategy return series to our list frames.append(df['Rets']) except: pass |

Now this stock list has over 3000 stocks in it, so expect this code to take a bit of time to run…I believe mine took about 15-20 minutes to run when I tried it, so try to be a bit patient.

Once the code has run and we have our list filled with all the individual strategy return series for each stock, we have to concatenate them all into a master DataFrame and then calculate the overall daily strategy return. This can be done as follows:

#concatenate the individual DataFrames held in our list- and do it along the column axis masterFrame = pd.concat(frames,axis=1) #create a column to hold the sum of all the individual daily strategy returns masterFrame['Total'] = masterFrame.sum(axis=1) #create a column that hold the count of the number of stocks that were traded each day #we minus one from it so that we dont count the "Total" column we added as a trade. masterFrame['Count'] = masterFrame.count(axis=1) - 1 #create a column that divides the "total" strategy return each day by the number of stocks traded that day to get equally weighted return. masterFrame['Return'] = masterFrame['Total'] / masterFrame['Count'] |

So now we have a return series that holds the strategy returns based on trading the qualifying stocks each day, in equal weight. If 2 stocks qualified, we would weight each stock at 50% in our portfolio for example.

So all that’s left to do now, is to plot the equity curve and calculate a rough Sharpe Ratio and annual return.

masterFrame['Return'].dropna().cumsum().plot() |

The Sharpe Ratio (excluding the risk free element for simplicity) can be calculated as follows:

(masterFrame['Return'].mean() *252) / (masterFrame['Return'].std() * (sqrt(252))) |

which gets us:

`2.176240875776992` |

and the annual return can be calculated as:

(masterFrame['Return'].dropna().cumsum()[-1]+1)**(365.0/days) - 1 |

Which gets us:

`0.088146958591373892` |

So a Sharpe Ratio of over 2 and an annual return of around 8.8% – that’s not too shabby!!

Of course, we have to remember that we are not taking into account any transaction costs so those returns could be quite heavily effected in a real world setting. Also, this strategy logic assumes we can buy the stocks that have gapped down exactly at their opening price, and assumes we always achieve the closing (settlement) price on selling at the end of the day, which of course wouldn’t be the case.

I’ll leave it up to you guys and girls to delve more deeply into the strategy returns – you can use my previous blog post where I analysed the returns of our moving average crossover strategy as inspiration. That post can be found here

Hi,

Thanks for the post. We are working on a high performance data analytics framework in python and would like to use your codes as examples. Are we allowed to use the material? Is there a license for this material?

Thanks,

Ehsan

Hi Ehsan – thanks for the kind words. I write this blog just for my own amusement, so no license is needed to re-use the code, please feel free to do so. All I would ask is that, if possible, you reference my blog as the source so that I may possibly attract more traffic. That’s up to you though 😉

I would be very interested to see the outcome of/hear more about your project, it sounds very interesting!

Of course, I’ll add a reference to this post. Here is the link to the example in the project: https://github.com/IntelLabs/hpat/blob/master/examples/intraday_mean.py

HPAT will compile this code (with minimal changes) automatically to run efficiently on clusters.

Looks great! Thanks for the mention too…much appreciated!

Hi S666 I was using your codes to test

I noticed something because this is taking Open to Close change, the line below should add a shift(1)?

df[‘Criteria2’] = df[‘Open’] > df[‘Moving Average’].shift(1)

Because if you dont you will be taking in today close price (But we are buying at Open and cannot possibly know today close prices)

*I am pulling data from my database but you data source may have accounted for this already if so pls ignore me thanks

Hi Jerrickng – good spot, I believe you are correct. If we are buying at the open price based upon the opening price being higher than the moving average, and we are using closing prices to calculate the moving average, we are in effect suffering from look forward bias as in real time we would not know the close price to use in the moving average calculation.

I shall change the code as soon as I get a moment.

nice blog!! …The best that I found about Python being used in Finance!!!

The only model which closely approximates financial markets is Geometric Brownian movement(GBM).Distance travelled under GBM is proportional to square root of time interval. Positive & negative shocks cancel each other over time in A diversified portfolio of stocks. On A net basis one can rarely beat the markets. According to option formula for A given stock S, if one month option costs 1 dollar then 4 month option on the same stock costs only 2 dollars because square root of 4 is two. Indirect way of stating this is that for A given time period chances that this stock would travel distance of 1d is 4 times compared to travelling distance of 2d.Option formulas may not be perfect 100%, but are damn good because trillions of dollars of derivatives are traded every day based on option formulas & market makers do not go bankrupt—whether they make market in puts or calls & stay out of speculation.

My question is whether following strategy is possibly sound in trading using computerized trading by A fund manager–

Computer puts in following order on stock “ S”.On the same ticket take profit & stop loss orders are always on the same side of current market price that day & not on opposite sides of current stock price.

1) Below the current price “P” put an order to buy that stock at “ P minus 1d” with take profit at “P minus1/2 d” & a stop loss at “P minus 2d”.This order is entered every day based on current price that day until executed whether at profit or with a loss–& same process is repeated on diversified portfolio of stocks all by computer with no human intervention. Similar orders are placed on the upside to sell short every day based on current prices that day using the same principals by the computer.No directional bet is ever made.

2)Stock prices go through noise every day on intraday basis. Chances that buy order would get filled at distance of “P minus 1D” is 4 times compared to hitting stop loss at “ P minus 2D” within same period of time on the same ticket order. With intraday noise, reversion to the mean, take profit order would get hit more times than stop loss on the same ticket order.

3) Under GBM, out of 4 episodes, 3 times there would be profit earned of “1/2d” each & one time there would be loss of “ 1d”with net profit of “½ d” on these 4 executions over & over again both on the downside as well as on the upside. Unfilled orders are cancelled every day when stock exchange closes. New orders are entered every morning based on CURRENT PRICE of the stock that day. Distance d is adjusted depending upon historical volatility of the stock so that decent number of orders are getting executed—if too many orders are getting executed then value of “d” is increased to slow down executions.With decent number of executions laws of averages would apply. Risk is controlled by controlling how many stock orders are placed both on the upside & downside. No directional bet any time—all orders are non-directional ,automatic & computer generated based on current volatility.Risk is also controlled by trading smaller amount of fund assets relative to total assets.

With low transactional costs ,fund manager would make money.

I would greatly appreciate your input into this strategy