Python Backtesting – ETF Mean Reversion – creating the ticker pairs

Right, welcome back, and sorry for the slight delay between posts… I’ve been way busier than I had hoped. On to our Python backtesting!

So this, I guess, could be considered the first proper post regarding the ETF mean reversion backtest script we’re trying to come up with. In the last post we went over creating our SQLite database and populating it with the ETF data scraped from So we should now have over 1000 ETF tickers at our disposal, which we can pull down and feed to the Pandas DataReader to fetch daily pricing data from the web for our backtest.

I wanted to do it this way so that we didn’t just have a database full of unidentifiable ETF tickers, but rather a whole raft of supporting information to go along with each one: underlying asset class, geographic region and “focus”, to name but three. These categories will allow us to pull down the ETF tickers and create pairs that are more likely to be co-integrated due to underlying fundamental factors, rather than ploughing through untold permutations of random tickers in the hope that we stumble across something worthwhile. For example, two ETFs that track silver as an underlying asset are more likely to be co-integrated than one ETF that tracks silver and another that tracks utilities companies in the Asia Pacific region, right? Seems to make sense at least…

Let’s start some code!
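As a rough sketch of this first step, the snippet below pulls the tickers with a “Focus” of “Silver” out of a SQLite database and into a pandas DataFrame called tickers. To keep it runnable on its own, it builds a tiny in-memory database first. The table name etftable, the column names Ticker and Focus, and the placeholder tickers (AAA, BBB and so on) are all assumptions for illustration, so swap in whatever your own database from the last post actually uses.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory database stands in for the real SQLite file.
cnx = sqlite3.connect(":memory:")

# Assumption: the table name "etftable" and the columns "Ticker" and "Focus"
# are hypothetical; the tickers below are placeholders, not real ETFs.
cnx.execute("CREATE TABLE etftable (Ticker TEXT, Focus TEXT)")
cnx.executemany(
    "INSERT INTO etftable VALUES (?, ?)",
    [("AAA", "Silver"), ("BBB", "Silver"), ("CCC", "Silver"), ("DDD", "Utilities")],
)
cnx.commit()

# Pull only the tickers whose "Focus" is Silver into a DataFrame.
tickers = pd.read_sql_query(
    "SELECT Ticker FROM etftable WHERE Focus = ? ORDER BY Ticker",
    cnx,
    params=("Silver",),
)
cnx.close()
```

Passing the focus as a query parameter rather than pasting it into the SQL string makes it easy to swap “Silver” for another category later on.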

If we now print out tickers we get a DataFrame as follows:

I’m going to then quickly iterate over the data in this DataFrame and append each item into a list so we can use them more easily later.
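Something along these lines would do it. This is a minimal sketch that assumes the DataFrame has a single column named Ticker (a hypothetical name) and uses placeholder tickers rather than real data.

```python
import pandas as pd

# Placeholder DataFrame standing in for the query result; the column name
# "Ticker" and the ticker values are assumptions for illustration.
tickers = pd.DataFrame({"Ticker": ["AAA", "BBB", "CCC"]})

# Iterate over the column and append each ticker to a plain Python list.
symbList = []
for symbol in tickers["Ticker"]:
    symbList.append(symbol)

# The same result in one line, if you prefer:
# symbList = tickers["Ticker"].tolist()
```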

Now time to create a quick function that will take our list “symbList” as an input, and return a list of unique ticker pairs. We obviously want to work with pairs of tickers as that’s what we will be feeding into our main function for the backtest, which we will create later.

Now this may not be the neatest way to go about doing this, but it’s the best I’ve got at the moment and it does the job just fine so far. If, however, some more advanced Python enthusiasts stumble across this post and have some constructive criticism, well, I’m always happy to learn a thing or two.
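One possible way to write such a function is sketched below, using itertools.combinations from the standard library. The function name get_symb_pairs and the placeholder tickers are assumptions for illustration, and this is not necessarily how the function in this post was originally written.

```python
from itertools import combinations


def get_symb_pairs(symbList):
    """Return every unique, unordered pair of tickers as a list of 2-item lists."""
    # Drop any duplicate tickers while keeping their original order.
    unique = list(dict.fromkeys(symbList))
    # combinations() yields each pair once, so ("AAA", "BBB") appears
    # but ("BBB", "AAA") does not.
    return [list(pair) for pair in combinations(unique, 2)]


# Placeholder tickers, not real ETFs.
symbList = ["AAA", "BBB", "CCC", "DDD"]
symbPairs = get_symb_pairs(symbList)
print(symbPairs)
```

With n unique tickers this produces n(n-1)/2 pairs, so the four placeholder tickers above give six pairs.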

Once that’s defined, we can add:

And if we print symbPairs we will get:

Fantastic! We now have a full list of unique ticker pairs made up of ETFs that have “Silver” listed as their “Focus”.

This will be very useful when it comes to feeding inputs into our main backtesting function which we will create over the next few blog posts.

As always, if you have any questions or comments, please leave them below. I’m always interested in what others have to say 😀

Until next time…

Written by s666