Hi all, this is the second part to the “Trading Strategy Analysis using Python and the FFN Package” post (the first part can be found here).
Last time we went over the use of the “PerformanceStats” object in ffn, whereas this time I want to concentrate on the “GroupStats” object. The former is for use with single series of data, whereas the latter can accept a full DataFrame with multiple columns representing multiple price/equity series simultaneously.
Let’s now create some new data. This time let’s set up a DataFrame containing 4 columns, each representing the ‘equity curve’ of a trading strategy (or asset price), and display the first 5 rows of that data.
import numpy as np
import pandas as pd
import ffn  # importing ffn adds calc_stats() and similar helpers to pandas objects
num_days = 1000
# "index" was defined in part one; a business-day DatetimeIndex is assumed here
index = pd.date_range('2010-01-01', periods=num_days, freq='B')
data1 = pd.DataFrame((np.random.randn(num_days) + np.random.uniform(low=0.0, high=0.2, size=num_days)),index=index,columns=['Data1'])
data2 = pd.DataFrame((np.random.randn(num_days) + np.random.uniform(low=0.0, high=0.2, size=num_days)),index=index,columns=['Data2'])
data3 = pd.DataFrame((np.random.randn(num_days) + np.random.uniform(low=0.0, high=0.2, size=num_days)),index=index,columns=['Data3'])
data4 = pd.DataFrame((np.random.randn(num_days) + np.random.uniform(low=0.0, high=0.2, size=num_days)),index=index,columns=['Data4'])
data = pd.concat([data1,data2,data3,data4],axis=1)
# turn the daily changes into equity curves that start at 100
data = data.cumsum() + 100
data.iloc[0] = 100
data.head()

The first step is identical to the first step for a single series of returns shown in part one of this tutorial: we convert the data into an ffn object, this time a “GroupStats” object, and assign that to a variable.
perf = data.calc_stats()
If we print the type of the variable we have just created, we can see it is indeed an “ffn.core.GroupStats” object. Note that when dealing with a single series of returns we created an “ffn.core.PerformanceStats” object instead.
print(type(perf))
<class 'ffn.core.GroupStats'>
Plotting the multiple return series works in a similar way:
perf.plot()

We can also run the same commands on a pandas DataFrame of stock price series we have downloaded from Yahoo Finance:
import pandas_datareader.data as web
stocks = ['AAPL','AMZN','MSFT','NFLX']
data = web.DataReader(stocks,data_source='yahoo',start='01/01/2010')['Adj Close']
data.sort_index(ascending=True,inplace=True)
perf = data.calc_stats()
If we plot the data now, you will notice that the prices are automatically rebased to 100 at the start to make them more comparable:
perf.plot()
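For anyone curious what that rebasing amounts to, here is a minimal sketch in plain pandas. The prices are made up for illustration, and this is only my reading of the idea (divide each series by its first value and scale to 100), not necessarily ffn's exact implementation.

```python
import pandas as pd

# Hypothetical prices for two assets on very different price scales.
prices = pd.DataFrame(
    {"A": [50.0, 55.0, 60.0], "B": [200.0, 190.0, 210.0]},
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)

# Divide every column by its first value, then scale so each series starts at 100.
rebased = prices / prices.iloc[0] * 100
print(rebased)
```

After rebasing, every column starts at the same level, so the lines on a shared chart show relative performance rather than raw price.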

Let’s stick with our downloaded real stock price data from now on. We can display stats very quickly for all of the series at once using the same call as for a single series:
perf.display()

This table of information can be accessed as a normal pandas DataFrame using the following syntax (it can also be indexed as normal):
perf.stats

It can be indexed, for example, as follows:
perf.stats.loc['cagr']
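As a reminder of what the 'cagr' row represents, here is a small hand calculation of a compound annual growth rate. The equity curve is hypothetical, and the 365.25-day year is an assumption of this sketch, so figures reported by ffn may differ slightly.

```python
import pandas as pd

# Hypothetical equity curve: 100 growing to 121 over two calendar years.
prices = pd.Series(
    [100.0, 110.0, 121.0],
    index=pd.to_datetime(["2018-01-01", "2019-01-01", "2020-01-01"]),
)

# Elapsed time in years, using a 365.25-day year (an assumption of this sketch).
years = (prices.index[-1] - prices.index[0]).days / 365.25

# CAGR = (ending value / starting value) ** (1 / years) - 1
cagr = (prices.iloc[-1] / prices.iloc[0]) ** (1 / years) - 1
print(f"{cagr:.2%}")
```

The result is close to 10% per year, as you would expect for a series that grows by 10% in each of two years.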

Up until now the GroupStats object has generated output very similar to the PerformanceStats object, so you may be wondering why we have both if they do the same things. Well, let’s move on to some “GroupStats” specific functionality.
Let’s begin by creating a DataFrame containing the series of log returns for each of the stock price curves in our “data” DataFrame.
returns = data.to_log_returns().dropna()
print(returns.head())
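If you want to see what a log return calculation looks like without ffn, the sketch below does the same kind of thing in plain numpy/pandas on made-up prices. It illustrates the standard definition, ln(P_t / P_{t-1}), rather than reproducing ffn's source.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for two assets.
prices = pd.DataFrame({"A": [100.0, 110.0, 99.0], "B": [50.0, 50.0, 55.0]})

# Log return for each period: ln(P_t / P_{t-1}).
# The first row has no prior price, so it is NaN and gets dropped.
log_returns = np.log(prices / prices.shift(1)).dropna()

# Log returns add up over time: their sum equals ln(last price / first price).
total = log_returns.sum()
print(log_returns)
print(total)
```

That additive property is one of the main reasons log returns are popular for this kind of analysis.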

We can quickly create a multiplot of histograms of each series’ log returns, along with a DataFrame containing their correlation matrix.
ax = returns.hist(figsize=(20, 10),bins=30)

returns.corr().as_format('.2f')
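To make the correlation matrix a little more concrete, here is a sketch on synthetic returns where one series is deliberately built to move with another. The data and column names are hypothetical, and the rounding is a plain-pandas stand-in for the formatting helper used above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical daily returns: "B" is built to move with "A", "C" is independent noise.
a = rng.normal(0, 0.01, 500)
returns = pd.DataFrame({
    "A": a,
    "B": 0.8 * a + rng.normal(0, 0.005, 500),
    "C": rng.normal(0, 0.01, 500),
})

# DataFrame.corr() computes pairwise Pearson correlation by default.
corr = returns.corr()

# Round to two decimals for display.
print(corr.round(2))
```

You should see a high value for the A/B pair and something close to zero for the pairs involving C, with ones down the diagonal.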

If you want something more visually appealing than the above DataFrame, then it’s just as easy to create a heatmap of the correlation matrix instead:
returns.plot_corr_heatmap()

There we go. That heatmap shows output that is a little more relevant than our previous randomly generated data. Let’s stick with the data downloaded from Yahoo Finance for the moment rather than the random data. We can very quickly output the “optimal portfolio” based on classic Markowitz mean-variance optimisation:
returns.calc_mean_var_weights().as_format('.2%')
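For intuition about what a mean-variance optimiser is doing, the sketch below computes the textbook unconstrained maximum-Sharpe (tangency) portfolio in closed form. The mean returns and covariance matrix are made up, the risk-free rate is assumed to be zero, and ffn's own routine may apply weight bounds and a different covariance estimate, so do not expect the numbers to match its output.

```python
import numpy as np

# Hypothetical annualised mean returns and covariance matrix for three assets.
mu = np.array([0.10, 0.12, 0.08])
cov = np.array([
    [0.040, 0.006, 0.004],
    [0.006, 0.090, 0.010],
    [0.004, 0.010, 0.025],
])

# Unconstrained maximum-Sharpe (tangency) weights with a zero risk-free rate:
# the weights are proportional to inverse(cov) @ mu, rescaled to sum to 1.
raw = np.linalg.solve(cov, mu)
weights = raw / raw.sum()

# Sharpe ratio of the resulting portfolio (expected return / volatility).
sharpe = weights @ mu / np.sqrt(weights @ cov @ weights)
print(weights.round(4), round(float(sharpe), 4))
```

A numerical optimiser with constraints (for example long-only weights) is the more usual approach in practice, since the closed form can produce short positions for other inputs.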

The lookback returns info can be accessed and indexed as follows:
perf.display_lookback_returns().loc['mtd']
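In case the 'mtd' (month-to-date) label is unfamiliar, here is a rough sketch of how such a figure can be computed by hand. The prices are hypothetical, and measuring from the last price of the previous month is an assumption of this sketch; conventions vary between libraries.

```python
import pandas as pd

# Hypothetical daily closing prices spanning a month boundary.
prices = pd.Series(
    [100.0, 102.0, 104.0, 103.0, 106.0],
    index=pd.to_datetime(
        ["2020-01-30", "2020-01-31", "2020-02-03", "2020-02-04", "2020-02-05"]
    ),
)

last_date = prices.index[-1]
# First calendar day of the latest month in the series.
month_start = last_date.replace(day=1)

# Reference price: the last observation before the current month began
# (an assumption of this sketch).
base = prices[prices.index < month_start].iloc[-1]
mtd = prices.iloc[-1] / base - 1
print(f"{mtd:.2%}")
```

The other lookback periods follow the same pattern with a different reference date.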
A nice looking scatterplot matrix can easily be produced as follows:
perf.plot_scatter_matrix()

We can also plot drawdown series for multiple assets simultaneously as follows:
ffn.to_drawdown_series(data).plot(figsize=(15,10))
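A drawdown series is simple enough to build by hand, which can help when checking the chart. The sketch below uses made-up prices and the usual definition (current price relative to the running peak, minus one); it is an illustration rather than a copy of ffn's code.

```python
import pandas as pd

# Hypothetical prices for two assets.
prices = pd.DataFrame({
    "A": [100.0, 120.0, 90.0, 110.0, 130.0],
    "B": [100.0, 95.0, 105.0, 84.0, 100.0],
})

# Drawdown at each point: current price relative to the running peak, minus 1.
drawdown = prices / prices.cummax() - 1

# The most negative value in each column is that asset's maximum drawdown.
max_drawdown = drawdown.min()
print(drawdown)
print(max_drawdown)
```

Values are zero whenever a series is at a new high and negative otherwise, which is why drawdown charts always sit at or below the zero line.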

I’ll leave it here for this post. I hope some of you found the above useful. There is further functionality available in the ffn package, but I’ll leave it to you to explore the more obscure parts of the module.
Until next time…
16 comments
Hi S666, just wanted to say I love the blog and articles. They are pitched at just the right level for people like me: strong Python programmers with weak quant skills 😉
I pick up much better with examples and this intro to ffn is perfect.
I notice that those guys have a backtesting framework ‘bt’ that leverages ffn.
Have you used that?
It’s only recently, I’m ashamed to say, that I’ve started using pandas and all that stuff for my Python systems. I’ve always crafted loop/event-based backtesters.
Hi Stephen, thanks for the kind words – glad you like the blog!
In terms of “bt” backtester framework – I have heard of it, but never actually dug into it for use…perhaps I should take a look and make that the subject of my next few posts – would you find that helpful at all?
If you find the bt stuff interesting and don’t have a backlog of other topics I would find it very interesting.
Of course, I look forward to your next post anyway, whatever it is. Cheers
Clear demo of the FFN package. Thank you S666 :).
is the wheel file available for ffn?
Hi Kannan – I had a quick search around Google and couldn’t find anything. The package can be downloaded from the relevant Github – or installed using pip….are you not able to just use the “pip” command to install it?
Do you have a substitute for yahoo finance? I’m getting the following error:
ImmediateDeprecationError:
Yahoo Daily has been immediately deprecated due to large breaks in the API without the
introduction of a stable replacement. Pull Requests to re-enable these data
connectors are welcome.
I was using the Quandl wiki data set but I received an email from them. They are having quality problems with the data set also.
Have you ever used data from IEX? Any other ideas?
Thank you. If I use monthly returns, how can I use the ffn package to do a similar analysis?
Thank you very much for these very complete exercises. I only have one problem, the sentence
perf.plot_scatter_matrix()
it returns the error
AttributeError: module 'pandas' has no attribute 'scatter_matrix'
Best regards
Hi – I have just tried to run the code myself and am getting the same error – strange as it definitely ran without problems at the time of writing the post!
I have raised an “issue” on the relevant ffn Github page. Feel free to follow the progress at:
https://github.com/pmorissette/ffn/issues/76
Hopefully they can shed some light on the matter!
pretty useful stuff! found out about the ffn package here, saves us a ton of time. thanks!
Hey S666, I noticed that after moving from Jupyter to my local IDE (I’m using Spyder), I don’t see any plots being displayed in my terminal. Do you know why this happens?
Are you explicitly calling “plt.show()” to call the figure?
Also in ipython you need to run “%matplotlib inline” just as in a notebook I believe
Hi, drawdown and calc mean var weights don’t work. Can you help me?
Hi there – when you say they don’t work, what exactly happens when you try to run them? Do you get an error message? If so, could you post it here and I will take a look for you.