In this article I wanted to concentrate on some basic time series analysis, and on efforts to see if there is any simple way we can improve our prediction skills in order to produce more accurate results. When considering most financial asset price time series you would be forgiven for concluding that, at various time frames (some longer, some shorter), many of the data sets we try to analyse can appear completely random. At least random enough that any hope of easily forecasting future values and paths is going to be a tough ask at the very least!
Basic Data Analysis
In this post I am going to be looking at portfolio optimisation methods, touching on both the use of Monte Carlo, “brute force” style optimisation and then the use of Scipy’s “optimize” function for “minimizing (or maximizing) objective functions, possibly subject to constraints”, as it states in the official docs (https://docs.scipy.org/doc/scipy/reference/optimize.html).
I have to apologise at this point for my jumping back and forth between the UK English spelling of the word “optimise” and the US English spelling (optimize)…my fingers just won’t allow me to type it with a “z” unless I absolutely have to, for some reason!!! When quoting the official docs or referring to the actual function itself I shall use a “z” to fall in line.
To set up the first part of the problem at hand – say we are building, or already have, a portfolio of stocks, and we wish to balance/rebalance our holdings in such a way that they match the “optimal” weights, where “optimal” means the portfolio with the highest Sharpe ratio, also known as the “mean-variance optimal” portfolio.
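To make the “brute force” idea a little more concrete, here is a minimal sketch of a Monte Carlo style search for the highest Sharpe ratio weights. It is only an illustration under some assumptions of my own: the function name `random_portfolios`, the three-asset expected returns and covariance matrix, the zero risk-free rate and the long-only constraint are all hypothetical and not taken from any real data set.

```python
import numpy as np

def random_portfolios(mean_returns, cov_matrix, n_portfolios=5000, risk_free=0.0, seed=0):
    """Draw random long-only weight vectors and return the one with the highest Sharpe ratio."""
    rng = np.random.default_rng(seed)
    n_assets = len(mean_returns)

    # Random positive weights, rescaled so each row sums to 1
    weights = rng.random((n_portfolios, n_assets))
    weights /= weights.sum(axis=1, keepdims=True)

    # Portfolio return and volatility for every simulated weight vector
    port_returns = weights @ mean_returns
    port_vols = np.sqrt(np.einsum("ij,jk,ik->i", weights, cov_matrix, weights))

    sharpe = (port_returns - risk_free) / port_vols
    best = np.argmax(sharpe)
    return weights[best], sharpe[best]

# Hypothetical annualised inputs for three assets (illustrative numbers only)
mean_returns = np.array([0.08, 0.12, 0.10])
cov_matrix = np.array([
    [0.040, 0.006, 0.004],
    [0.006, 0.090, 0.010],
    [0.004, 0.010, 0.060],
])

best_weights, best_sharpe = random_portfolios(mean_returns, cov_matrix)
print(best_weights.round(3), round(float(best_sharpe), 3))
```

The more random portfolios we draw, the closer the best one should get to the true optimum, which is exactly why a proper optimiser tends to be the more efficient route once the number of assets grows.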
In this article I thought I would take a look at and compare the concepts of “Monte Carlo analysis” and “Bootstrapping” in relation to simulating returns series and generating corresponding confidence intervals as to a portfolio’s potential risks and rewards.
Both methods are used to generate simulated price paths for a given asset, or portfolio of assets, but they use slightly differing approaches, and the difference can appear reasonably subtle to those who haven’t come across them before. Technically Bootstrapping is a special case of Monte Carlo simulation, which is why it may seem a little confusing at first glance.
With Monte Carlo analysis (and here we are talking specifically about the “Parametric” Monte Carlo approach) the idea is to generate data based upon some underlying model characteristics. So for example, we generate data based upon a Normal distribution, specifying our desired inputs to the model, in this case being the mean and the standard deviation. Where do we get these input figures from I hear you ask…well more often than not people tend to use values based on the historic, realised values for the assets in question.
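As a rough sketch of the difference between the two approaches, the snippet below simulates returns both ways. Please treat it as an illustration rather than a definitive implementation: the “historic” returns are themselves randomly generated stand-ins for real data, and the helper `terminal_interval`, the 252-day horizon and the 5th/95th percentile bounds are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "historic" daily returns; in practice these would come from real price data
historic_returns = rng.normal(loc=0.0005, scale=0.01, size=1000)

n_sims, horizon = 1000, 252

# Parametric Monte Carlo: estimate the mean and standard deviation from history,
# then draw brand new returns from a Normal distribution with those parameters
mu = historic_returns.mean()
sigma = historic_returns.std(ddof=1)
parametric_sims = rng.normal(mu, sigma, size=(n_sims, horizon))

# Bootstrapping: resample the realised returns themselves, with replacement,
# so no distributional model is assumed
bootstrap_sims = rng.choice(historic_returns, size=(n_sims, horizon), replace=True)

def terminal_interval(sim_returns, lower=5, upper=95):
    """Percentile interval of the compounded return at the end of each simulated path."""
    terminal = np.prod(1 + sim_returns, axis=1) - 1
    return np.percentile(terminal, [lower, upper])

parametric_ci = terminal_interval(parametric_sims)
bootstrap_ci = terminal_interval(bootstrap_sims)
print("Parametric:", parametric_ci)
print("Bootstrap: ", bootstrap_ci)
```

Note that the bootstrapped paths can only ever contain returns that actually occurred in the historic sample, whereas the parametric paths can contain any value the chosen distribution allows.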
This blog post is a result of a request I received on the website Facebook group page from a follower who asked me to analyse/play around with a csv data file he had provided. The request was to use Pandas to wrangle the data and perform some filtering and aggregation, with the view to plot the resulting figures using Matplotlib. Now Matplotlib was explicitly asked for, rather than Seaborn or any other higher level plotting library (even if they are built on the Matplotlib API) so I shall endeavour to use base Matplotlib where possible, rather than rely on any of the aforementioned (more user friendly) modules.
Well it’s time for part 4 of our mini-series outlining how to create a program to generate performance reports in nice, fancy looking HTML format that we can render in our browser and interact with (to a certain extent). The previous post can be found here. If you copy and paste the last iteration of the code for “main.py” and “template.html” from the last post into your own local files and recreate the folder and file structure outlined in part 1 (which can be found here), then you should be pretty much ready to follow on from here.
So I promised at the end of the last post that I would stop adding random charts and tables with additional KPIs and equity curves and what not, and try to add a bit of functionality that one may actually find useful even if it weren’t part of this whole specific performance report creation tutorial. I know many people are interested in the concept of Monte Carlo analysis and the insights it can offer above and beyond those statistics and visuals created from the actual return series of the investment/trading strategy under inspection.
This is the third part of the current “mini-series” providing a walk-through of how to create a “Report Generation” tool to allow the creation and display of a performance report for our (backtest) strategy equity series/returns.
To recap, the way we left the code and report output at the end of the last blog post is shown below. The “main.py” file looked like this:
This is the second part of the current “mini-series” providing a walk-through of how to create a “Report Generation” tool to allow the creation and display of a performance report for our (backtest) strategy equity series/returns.
As long as the equity series (and an optional benchmark equity series) are formatted in the correct manner and dropped into the “data” folder in csv format, it will eventually take no more than a click of a button and we will be able to produce in-depth, interactive strategy performance reports. This will be invaluable when it comes to separating the “wheat from the chaff” in terms of prototype trading strategy backtest results. We won’t have to recreate our analysis efforts again and again; rather, we just run the results through this program and the hard work is done for us.
To recap, the way we left the code and report output at the end of the last blog post is shown below:
I’ve been thinking about the topic for the next series of blog posts, and after a bit of deliberation I’ve decided to create a multi-part series of articles with a walk through of how to create a customisable HTML trading strategy report generator.
That is, once all is done and dusted all that will be required is to create a csv file with your trading strategy equity curve data in one column, and an (optional) benchmark equity series in a second column, place it in a particular folder, click a couple of buttons and “Hey Presto!” out will pop an HTML file which can be rendered in your browser and will contain all sorts of charts, statistics and analysis on your particular strategy performance.
Before we get down to any actual performance analysis and calculation of relevant stats etc, we first need to create a quick “skeleton” report which will contain all the necessary files, modules and logic to generate the most basic of HTML output files, using a simple “placeholder” variable to make sure things are working.
I know at this stage what I am saying may not make much sense, but all will become clear shortly.
Firstly we need to create the necessary folder structure along with some files which we will be using as we go along.
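To give a flavour of what such a “skeleton” might look like, here is a minimal sketch that creates a folder structure, writes a tiny template and fills in a single placeholder variable. It makes several assumptions of my own: it uses the standard library’s `string.Template` purely to keep the example dependency-free (a dedicated templating package could be swapped in), and the function name `build_skeleton_report`, the “templates” folder, the “report.html” output file and the `$title` placeholder are all hypothetical.

```python
from pathlib import Path
from string import Template
import tempfile

def build_skeleton_report(base_dir, title="Strategy Report"):
    """Create a basic folder structure and render a one-placeholder HTML report."""
    base = Path(base_dir)

    # Folder for the input csv files, plus a (hypothetical) folder for the template
    (base / "data").mkdir(parents=True, exist_ok=True)
    (base / "templates").mkdir(exist_ok=True)

    # Minimal template containing a single "$title" placeholder
    template_path = base / "templates" / "template.html"
    template_path.write_text(
        "<html><head><title>$title</title></head>"
        "<body><h1>$title</h1></body></html>",
        encoding="utf-8",
    )

    # Substitute the placeholder and write the rendered report to disk
    html = Template(template_path.read_text(encoding="utf-8")).substitute(title=title)
    report_path = base / "report.html"
    report_path.write_text(html, encoding="utf-8")
    return report_path

# Run inside a temporary directory so nothing is left behind on disk
with tempfile.TemporaryDirectory() as tmp:
    report = build_skeleton_report(tmp, title="Placeholder Title")
    report_html = report.read_text(encoding="utf-8")

print(report_html)
```

If the placeholder text shows up in the rendered HTML, we know the plumbing works and the real statistics and charts can be slotted in later.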
Carrying on from the last blog post, I am now going to shift attention to plotting categorical data with Seaborn. So let’s write our first few lines of code that deals with the import of various packages and loads our excel file into a DataFrame. The excel file we are using can be downloaded by clicking the download link below.
import pandas as pd
import seaborn as sns

#if using Jupyter Notebooks the below line allows us to display charts in the browser
%matplotlib inline

#load our data in a Pandas DataFrame
df = pd.read_excel('Financial Sample.xlsx')

#set the style we wish to use for our plots
sns.set_style("darkgrid")

#print first 5 rows of data to ensure it is loaded correctly
df.head()
I thought for this post I would look into the Seaborn library – Seaborn is a statistical plotting library and is built on top of Matplotlib. It has really nice looking default plotting styles and also works really well with Pandas DataFrames – so we can leverage the work we have done with Pandas in previous blog posts and hopefully create some great plots.
Seaborn can be installed just like any other Python package by using “pip”. Go to your command line and run:
pip install seaborn
The official documentation page for Seaborn can be found here and a lovely looking gallery page showing examples of what is possible with Seaborn can be found here. You can click on any of the images on the gallery page and it will present you with example code on how to produce that particular plot. Another important page is the API page, which references the various available plot types – this can be found here.
I am going to try to break the Seaborn capabilities down into various categories – and begin with the plots that allow us to visualise the distribution of a data set.