
Investment Portfolio Optimisation with Python – Revisited

by Stuart Jamieson

In this post I am going to be looking at portfolio optimisation methods, touching on both the use of Monte Carlo, “brute force” style optimisation and then the use of Scipy’s “optimize” function for “minimizing (or maximizing) objective functions, possibly subject to constraints”, as it states in the official docs (https://docs.scipy.org/doc/scipy/reference/optimize.html).

I have to apologise at this point for my jumping back and forth between the UK English spelling of the word “optimise” and the US English spelling (optimize)…my fingers just won’t allow me to type it with a “z” unless I absolutely have to, for some reason!!! When quoting the official docs or referring to the actual function itself I shall use a “z” to fall in line.

To set up the first part of the problem at hand – say we are building, or have, a portfolio of stocks, and we wish to balance/rebalance our holdings in such a way that they match the "optimal" weights, where "optimal" means the portfolio with the highest Sharpe ratio, also known as the "mean-variance optimal" portfolio.

The first way I am going to attempt this is through a "brute force" style Monte Carlo approach. With this approach we try to discover the optimal weights by simply creating a large number of random portfolios, all with varying combinations of constituent stock weightings, calculating and recording the Sharpe ratio of each of these randomly weighted portfolios and then finally extracting the details corresponding to the result with the highest value.

The random weightings that we create in this example will be bound by the constraint that they must be between zero and one for each of the individual stocks, and also that all the weights must sum to one to represent an investment of 100% of our theoretical capital.

The more random portfolios that we create and calculate the Sharpe ratio for, theoretically the closer we get to the weightings of the “real” optimal portfolio. We will always experience some discrepancies however as we can never run enough simulated portfolios to replicate the exact weights we are searching for…we can get close, but never exact.
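Those two constraints can be sketched in a couple of lines (a minimal illustration of the weighting scheme used later, assuming 5 stocks):

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded only so the sketch is reproducible

# draw 5 random weights in [0, 1) and normalise them so they sum to 1
weights = rng.random(5)
weights /= weights.sum()

print(weights, weights.sum())  # each weight lies in [0, 1]; total is 1.0
```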

In this example we will create a portfolio of 5 stocks and run 100,000 simulated portfolios to produce our results. These results will then be plotted, and both the "optimal" portfolio with the highest recorded Sharpe ratio and the "minimum variance portfolio" will be highlighted and marked for identification. The "minimum variance portfolio" is just what it sounds like: the portfolio with the lowest recorded variance (which also, by definition, displays the lowest recorded standard deviation or "volatility").

Let us start the code!

As always we begin by importing the required modules.

import pandas as pd  
import numpy as np
from pandas_datareader import data
import datetime
import scipy.optimize as sco
from scipy import stats
import matplotlib.pyplot as plt
%matplotlib inline

We then download price data for the stocks we wish to include in our portfolio. In this example I have chosen 5 random stocks that I am sure most people will at least have heard of…Apple, Microsoft, Netflix, Amazon and Google.

tickers = ['AAPL', 'MSFT', 'NFLX', 'AMZN', 'GOOG']
start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2018, 12, 31)
df = pd.DataFrame([data.DataReader(ticker, 'yahoo', start, end)['Adj Close'] for ticker in tickers]).T
df.columns = tickers

The results will be produced by defining and running two functions (shown below). The first function (calc_portfolio_perf) calculates the annualised return, annualised standard deviation and annualised Sharpe ratio of a portfolio, given the arguments we pass it. The arguments we provide are: the weights of the portfolio constituents, the mean daily return of each of those constituents (as calculated over the historic data that we downloaded earlier), the covariance matrix of the constituents and finally the risk free interest rate. The risk free rate is required for the calculation of the Sharpe ratio and should be provided as an annualised rate. In this example I have chosen to set the rate to zero, but the functionality is there to easily amend this for your own purposes.

The second function deals with the overall creation of multiple randomly weighted portfolios, which are then passed to the function we just described above to calculate the required values we wish to record. The values are then indeed recorded and once all portfolios have been simulated, the results are stored in and returned as a Pandas DataFrame.

The values recorded are, as previously mentioned, the annualised return, annualised standard deviation and annualised Sharpe ratio – we also store the weights of each stock in the portfolio that generated those values.

def calc_portfolio_perf(weights, mean_returns, cov, rf):
    portfolio_return = np.sum(mean_returns * weights) * 252
    portfolio_std = np.sqrt(np.dot(weights.T, np.dot(cov, weights))) * np.sqrt(252)
    sharpe_ratio = (portfolio_return - rf) / portfolio_std
    return portfolio_return, portfolio_std, sharpe_ratio
def simulate_random_portfolios(num_portfolios, mean_returns, cov, rf):
    results_matrix = np.zeros((len(mean_returns)+3, num_portfolios))
    for i in range(num_portfolios):
        weights = np.random.random(len(mean_returns))
        weights /= np.sum(weights)
        portfolio_return, portfolio_std, sharpe_ratio = calc_portfolio_perf(weights, mean_returns, cov, rf)
        results_matrix[0,i] = portfolio_return
        results_matrix[1,i] = portfolio_std
        results_matrix[2,i] = sharpe_ratio
        #iterate through the weight vector and add data to results array
        for j in range(len(weights)):
            results_matrix[j+3,i] = weights[j]
            
    results_df = pd.DataFrame(results_matrix.T,columns=['ret','stdev','sharpe'] + [ticker for ticker in tickers])
        
    return results_df

Now we quickly calculate the mean returns and covariance matrix of our list of stocks, set the number of portfolios we wish to simulate and finally set the desired value of the risk free rate. We then call the required function and store the results in a variable so we can then extract and visualise them.

mean_returns = df.pct_change().mean()
cov = df.pct_change().cov()
num_portfolios = 100000
rf = 0.0
results_frame = simulate_random_portfolios(num_portfolios, mean_returns, cov, rf)

Below we visualise the results of all the simulated portfolios, plotting each portfolio by its corresponding values of annualised return (y-axis) and annualised volatility (x-axis), and also identify the two portfolios we are interested in. These are highlighted with a red star for the maximum Sharpe ratio portfolio, and a green star for the minimum variance portfolio.

The data points are coloured according to their respective Sharpe ratios, with blue signifying a higher value, and red a lower value.

#locate position of portfolio with highest Sharpe Ratio
max_sharpe_port = results_frame.iloc[results_frame['sharpe'].idxmax()]
#locate position of portfolio with minimum standard deviation
min_vol_port = results_frame.iloc[results_frame['stdev'].idxmin()]
#create scatter plot coloured by Sharpe Ratio
plt.subplots(figsize=(15,10))
plt.scatter(results_frame.stdev,results_frame.ret,c=results_frame.sharpe,cmap='RdYlBu')
plt.xlabel('Standard Deviation')
plt.ylabel('Returns')
plt.colorbar()
#plot red star to highlight position of portfolio with highest Sharpe Ratio
plt.scatter(max_sharpe_port[1],max_sharpe_port[0],marker=(5,1,0),color='r',s=500)
#plot green star to highlight position of minimum variance portfolio
plt.scatter(min_vol_port[1],min_vol_port[0],marker=(5,1,0),color='g',s=500)
plt.show()

Now we just take a look at the stock weightings that made up those two portfolios, along with the annualised return, annualised standard deviation and annualised Sharpe ratio. These are shown below firstly for the maximum Sharpe portfolio, and then for the minimum variance portfolio.

max_sharpe_port.to_frame().T
min_vol_port.to_frame().T

Next we begin the second approach to the optimisation – one that uses the Scipy "optimize" functions. The code is fairly brief but there are a couple of things worth mentioning. Firstly, Scipy offers a "minimize" function but no "maximize" function. Seeing as we wish to maximise the Sharpe ratio, this may seem like a bit of a problem at first glance, but it is easily solved by realising that the maximisation of the Sharpe ratio is analogous to the minimisation of the negative Sharpe ratio – that is, literally just the Sharpe ratio value with a minus sign stuck at the front.
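To see why that works, here is a toy one-variable example (a hypothetical function, nothing to do with the portfolio itself): to maximise f(x) = -(x - 2)^2 + 5 we minimise its negative with Scipy:

```python
import scipy.optimize as sco

# toy objective: f(x) = -(x - 2)^2 + 5 has its maximum (value 5) at x = 2
def neg_f(x):
    # minimising -f(x) is equivalent to maximising f(x)
    return (x[0] - 2) ** 2 - 5

result = sco.minimize(neg_f, x0=[0.0], method='SLSQP')
print(result.x[0], -result.fun)  # ~2.0 and ~5.0 respectively
```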

So firstly we define a function (very similar to our earlier function) that calculates and returns the negative Sharpe ratio of a portfolio.

Then we define a variable I have labelled “constraints”. This can look somewhat strange at first if you haven’t used the Scipy “optimize” capabilities before.

Let me run through each entry and hopefully clarify them somewhat:

Firstly, as we will be using the 'SLSQP' method in our "minimize" function (which stands for Sequential Least Squares Programming), the constraints argument must be in the format of a list of dictionaries, containing the fields "type" and "fun", with the optional fields "jac" and "args". We only need the fields "type", "fun" and "args", so let's run through them.

The "type" can be either "eq" or "ineq", referring to "equality" or "inequality" respectively. The "fun" refers to the function defining the constraint – in our case the constraint that the sum of the stock weights must be 1. The way this needs to be entered is a bit "back to front". The "eq" means we are looking for our function to equate to zero (this is what the equality is in reference to – equality to zero, in effect). So the simplest way to achieve this is to create a lambda function that returns the sum of the portfolio weights, minus 1. The constraint that the function needs to equate to zero by definition means that the weights must sum to 1. It's admittedly a bit strange looking for some people at first, but there you go…
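The "equate to zero" idea can be checked directly by evaluating the constraint function on its own (a minimal sketch):

```python
import numpy as np

# the "fun" entry from the constraints dictionary: sum of weights minus 1
constraint_fun = lambda x: np.sum(x) - 1

valid = np.array([0.25, 0.25, 0.25, 0.25])  # sums to exactly 1
invalid = np.array([0.5, 0.5, 0.5])         # sums to 1.5

print(constraint_fun(valid))    # 0.0 -> "eq" constraint satisfied
print(constraint_fun(invalid))  # 0.5 -> constraint violated
```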

The “bounds” just specify that each individual stock weight must be between 0 and 1, with the “args” being the arguments that we want to pass to the function we are trying to minimise (calc_neg_sharpe) – that is all the arguments EXCEPT the weights vector which of course is the variable we are changing to optimise the output.

def calc_neg_sharpe(weights, mean_returns, cov, rf):
    portfolio_return = np.sum(mean_returns * weights) * 252
    portfolio_std = np.sqrt(np.dot(weights.T, np.dot(cov, weights))) * np.sqrt(252)
    sharpe_ratio = (portfolio_return - rf) / portfolio_std
    return -sharpe_ratio
constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
def max_sharpe_ratio(mean_returns, cov, rf):
    num_assets = len(mean_returns)
    args = (mean_returns, cov, rf)
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bound = (0.0,1.0)
    bounds = tuple(bound for asset in range(num_assets))
    result = sco.minimize(calc_neg_sharpe, num_assets*[1./num_assets,], args=args,
                        method='SLSQP', bounds=bounds, constraints=constraints)
    return result
optimal_port_sharpe = max_sharpe_ratio(mean_returns, cov, rf)

When we run the optimisation, we get the following results:

pd.DataFrame([round(x,2) for x in optimal_port_sharpe['x']],index=tickers).T

When we compare this output with that from our Monte Carlo approach, we can see that they are similar but, as explained above, not identical. The weightings of each stock are within a couple of percent of each other between the two approaches…hopefully that indicates we did something right at least!

We can then just use the same approach to identify the minimum variance portfolio. It's almost the same code as above, although this time we need to define a function to calculate and return the volatility of a portfolio, and use it as the function we wish to minimise ("calc_portfolio_std"). This time there is no need to negate the output of our function, as it is already a minimisation problem (as opposed to the Sharpe ratio, where we wanted to find the maximum).

The constraints remain the same, so we just adapt the “max_sharpe_ratio” function above, rename it to “min_variance” and change the “args” variable to hold the correct arguments that we need to pass to our new “calc_portfolio_std” that we are minimising.

def calc_portfolio_std(weights, mean_returns, cov):
    portfolio_std = np.sqrt(np.dot(weights.T, np.dot(cov, weights))) * np.sqrt(252)
    return portfolio_std
def min_variance(mean_returns, cov):
    num_assets = len(mean_returns)
    args = (mean_returns, cov)
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bound = (0.0,1.0)
    bounds = tuple(bound for asset in range(num_assets))
    result = sco.minimize(calc_portfolio_std, num_assets*[1./num_assets,], args=args,
                        method='SLSQP', bounds=bounds, constraints=constraints)
    return result
min_port_variance = min_variance(mean_returns, cov)

When we run the optimisation, we get the following results:

pd.DataFrame([round(x,2) for x in min_port_variance['x']],index=tickers).T

Again we see the results are very close to those we were presented with when using the Monte Carlo approach.

Great stuff so far! Now let us move on to the problem of identifying the portfolio weights that minimise the Value at Risk (VaR).

The logic is very similar to that followed when dealing with the first Monte Carlo problem above, so I will try to identify only the changes and differences rather than repeat myself too much. We start again by creating our two functions – but this time, instead of returning the Sharpe ratio as its third value, the first function returns the parametric portfolio VaR, at a confidence level determined by the value of the "alpha" argument (the confidence level will be 1 – alpha), and over a time scale determined by the "days" argument.

The method I have chosen for the VaR calculation is as follows: scale the portfolio standard deviation by the square root of the "days" value, multiply the scaled standard deviation by the relevant "Z value" for the chosen value of "alpha", and subtract the result from the portfolio's daily mean return scaled linearly by the "days" value. This final VaR value is then converted to an absolute value, as VaR is more often than not reported as a positive number (it also allows us to run the required "minimization" function when it is cast as a positive value).

As a note, VaR is sometimes calculated in such a way that the mean returns of the portfolio are considered to be small enough that they can be entered into the equation with a zero value – this tends to make more sense when we are looking at VaR over short time periods like a daily or a weekly VaR figure, however when we start to look at annualised VaR figures it begins to make more sense to incorporate a “non-zero” return element.

Finally, the above approach where returns are entered as zero (effectively removing them from the calculation) is sometimes favoured as it is a more "pessimistic" view of a portfolio's VaR, and when dealing with the quantification of risk, or in fact any "downside" forecast, it is wise to err on the side of caution and make decisions based on a worst case scenario. The costs of being wrong due to underestimating VaR and due to overestimating it are almost never symmetric – there is almost always a higher cost to an underestimation.
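As a worked example of that calculation, with made-up illustrative numbers rather than our actual portfolio: suppose a portfolio has a mean daily return of 0.05% and a daily standard deviation of 1%, and we want the one-year (252-day) 95% VaR:

```python
import numpy as np
from scipy import stats

mean_daily_return = 0.0005  # illustrative assumption: 0.05% per day
daily_std = 0.01            # illustrative assumption: 1% per day
days, alpha = 252, 0.05

# mean return scales linearly with time; standard deviation with its square root
annual_return = mean_daily_return * days   # 0.126
annual_std = daily_std * np.sqrt(days)     # ~0.159

# z value for the chosen alpha (95% confidence level)
z = stats.norm.ppf(1 - alpha)              # ~1.645

# parametric VaR, cast to a positive number
var_95 = abs(annual_return - annual_std * z)
print(round(var_95, 4))                    # ~0.1351
```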

The second function is pretty much analogous to the one used for the Sharpe optimisation with some slight changes to variable names, parameters and arguments passed of course.

def calc_portfolio_perf_VaR(weights, mean_returns, cov, alpha, days):
    portfolio_return = np.sum(mean_returns * weights) * days
    portfolio_std = np.sqrt(np.dot(weights.T, np.dot(cov, weights))) * np.sqrt(days)
    portfolio_var = abs(portfolio_return - (portfolio_std * stats.norm.ppf(1 - alpha)))
    return portfolio_return, portfolio_std, portfolio_var

def simulate_random_portfolios_VaR(num_portfolios, mean_returns, cov, alpha, days):
    results_matrix = np.zeros((len(mean_returns)+3, num_portfolios))
    for i in range(num_portfolios):
        weights = np.random.random(len(mean_returns))
        weights /= np.sum(weights)
        portfolio_return, portfolio_std, portfolio_VaR = calc_portfolio_perf_VaR(weights, mean_returns, cov, alpha, days)
        results_matrix[0,i] = portfolio_return
        results_matrix[1,i] = portfolio_std
        results_matrix[2,i] = portfolio_VaR
        #iterate through the weight vector and add data to results array
        for j in range(len(weights)):
            results_matrix[j+3,i] = weights[j]
            
    results_df = pd.DataFrame(results_matrix.T,columns=['ret','stdev','VaR'] + [ticker for ticker in tickers])
        
    return results_df

Similar variables are defined as before this time with the addition of “days” and “alpha”. The “days” variable determines the time frame over which the VaR figure is calculated/scaled and the “alpha” variable is the significance level used for the calculation (with confidence level being (1 – significance level) as mentioned just above).

I have chosen 252 days (to represent a year’s worth of trading days) and an alpha of 0.05, corresponding to a 95% confidence level. So that is to say we will be calculating the one-year 95% VaR, and attempting to minimise that value.

Now let’s run the simulation function and plot the results again.

mean_returns = df.pct_change().mean()
cov = df.pct_change().cov()
num_portfolios = 100000
rf = 0.0
days = 252
alpha = 0.05
results_frame = simulate_random_portfolios_VaR(num_portfolios, mean_returns, cov, alpha, days)

This time we plot the results of each portfolio with annualised return remaining on the y-axis but the x-axis this time representing the portfolio VaR (rather than standard deviation). The plot colours the data points according to the value of VaR for that portfolio.

#locate position of portfolio with minimum VaR
min_VaR_port = results_frame.iloc[results_frame['VaR'].idxmin()]
#create scatter plot coloured by VaR
plt.subplots(figsize=(15,10))
plt.scatter(results_frame.VaR,results_frame.ret,c=results_frame.VaR,cmap='RdYlBu')
plt.xlabel('Value at Risk')
plt.ylabel('Returns')
plt.colorbar()
#plot red star to highlight position of minimum VaR portfolio
plt.scatter(min_VaR_port[2],min_VaR_port[0],marker=(5,1,0),color='r',s=500)
plt.show()

The weights of the resulting minimum VaR portfolio are shown below.

min_VaR_port.to_frame().T

So far so good it seems…what happens if we plot the location of the minimum VaR portfolio on a chart with the y-axis as return and the x-axis as standard deviation as before? The data points are still coloured according to their corresponding VaR value. Let’s take a look.

#locate position of portfolio with minimum VaR
min_VaR_port = results_frame.iloc[results_frame['VaR'].idxmin()]
#create scatter plot coloured by VaR
plt.subplots(figsize=(15,10))
plt.scatter(results_frame.stdev,results_frame.ret,c=results_frame.VaR,cmap='RdYlBu')
plt.xlabel('Standard Deviation')
plt.ylabel('Returns')
plt.colorbar()
#plot red star to highlight position of minimum VaR portfolio
plt.scatter(min_VaR_port[1],min_VaR_port[0],marker=(5,1,0),color='r',s=500)
plt.show()

Now you might notice at this point that the results of the minimum VaR portfolio simulations look pretty similar to those of the maximum Sharpe ratio portfolio, but that is to be expected considering the calculation method chosen for VaR.

The VaR calculation was:

VaR = abs(portfolio_return - (portfolio_std * z(1 - alpha)))

And the calculation of the Sharpe ratio was:

Sharpe ratio = (portfolio_return - rf) / portfolio_std

From this we can see that VaR falls when portfolio returns increase and vice versa, whereas the Sharpe ratio increases as portfolio returns increase – so what minimises VaR in terms of returns actually maximises the Sharpe ratio.

Similarly, an increase in portfolio standard deviation increases VaR but decreases the Sharpe ratio – so what maximises VaR in terms of portfolio standard deviation actually minimises the Sharpe ratio.

Seeing as we are looking for the minimum VaR and the maximum Sharpe, it makes sense that they will be achieved with "similar" portfolios.
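A quick numeric check of those two relationships, using made-up figures (the return and standard deviation values are chosen so the absolute value in the VaR formula does not flip the ordering):

```python
from scipy import stats

std = 0.20                # illustrative annualised standard deviation
z = stats.norm.ppf(0.95)  # z value for 95% confidence, ~1.645
rf = 0.0

def var_95(ret):
    return abs(ret - std * z)

def sharpe(ret):
    return (ret - rf) / std

low_ret, high_ret = 0.05, 0.15  # both below std * z (~0.329)

# a higher return lowers VaR but raises the Sharpe ratio
print(var_95(low_ret), sharpe(low_ret))
print(var_95(high_ret), sharpe(high_ret))
```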

Now we move onto the second approach to identify the minimum VaR portfolio. Again the code is rather similar to the optimisation code used to calculate the maximum Sharpe and minimum variance portfolios, again with some minor tweaking.

We need a new function that calculates and returns just the VaR of a portfolio; this is defined first. Nothing changes here from our original function that calculated VaR, except that we return a single VaR value rather than the original three values (which also included portfolio return and standard deviation).

The “min_VaR” function acts much as the “max_sharpe_ratio” and “min_variance” functions did, just with some tweaks to alter the arguments as needed. The constraints are the same, as are the bounds etc.

constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
def calc_portfolio_VaR(weights, mean_returns, cov, alpha, days):
    portfolio_return = np.sum(mean_returns * weights) * days
    portfolio_std = np.sqrt(np.dot(weights.T, np.dot(cov, weights))) * np.sqrt(days)
    portfolio_var = abs(portfolio_return - (portfolio_std * stats.norm.ppf(1 - alpha)))
    return portfolio_var
def min_VaR(mean_returns, cov, alpha, days):
    num_assets = len(mean_returns)
    args = (mean_returns, cov, alpha, days)
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bound = (0.0,1.0)
    bounds = tuple(bound for asset in range(num_assets))
    result = sco.minimize(calc_portfolio_VaR, num_assets*[1./num_assets,], args=args,
                        method='SLSQP', bounds=bounds, constraints=constraints)
    return result
min_port_VaR = min_VaR(mean_returns, cov, alpha, days)

When we run the optimisation, we get the following results:

pd.DataFrame([round(x,2) for x in min_port_VaR['x']],index=tickers).T

Once again we see the results are very close to those we were presented with when using the Monte Carlo approach, with the weights being within a couple of percent of each other.

So there you have it, two approaches (Monte Carlo "brute force" and the use of Scipy's "minimize" function) to optimise a portfolio of stocks based on minimising different cost functions (i.e. the negative Sharpe ratio, the variance and the Value at Risk).

I hope that has been somewhat interesting to some of you at least…until next time!


48 comments

Ivan Hudec 2 July 2019 - 08:03

Excellent thank you.

s666 2 July 2019 - 08:07

Hi Ivan, many thanks for the comment- you’re very welcome 😉

Scott 3 July 2019 - 11:20

Great work, thanks!
Would love to see a comparison of historical returns & metrics using the various optimization approaches to historically holding different portfolios of assets classes (say ETFs) over time, rebalanced monthly.

s666 4 July 2019 - 00:01

Hi Scott, thanks for your comment. Sounds like a nice idea to run some historical comparisons of the differing portfolio suggestions, see if the reality bears out the same as the theory. I could run some "walk forward" optimisation, running the analysis each month and then holding that optimal portfolio for the following month so there is no "look forward bias" as it were.

I’ll get on to this as soon as I have a free moment.

Chris 3 July 2019 - 15:04

Awesome work very well explained, thank you!
I second Scott, it would be interesting to see a backtest of the various optimizations 😉
and may I ask what matplotlib theme you use?
Thanks

s666 4 July 2019 - 00:05

Hi Chris, thanks for your comment also…I will make that the subject of my next post. It’s always nice to have things suggested by readers, so many thanks for that.

In terms of the theme I used, it wasn't a matplotlib theme per se, but rather a Jupyter Notebook theme using the following package: https://github.com/dunovank/jupyter-themes

I am also planning to do a couple of posts on environments used for coding so this will definitely be explained in there shortly also.

Chris 4 July 2019 - 10:23

Thank you very much for your quick answer.
Looking forward to see your future publications 😉

Birdy 7 July 2019 - 21:07

Very, very good s666 :-). Just one small note – you did forget to include: pd.DataFrame([round(x,2) for x in min_port_variance['x']],index=tickers).T

s666 7 July 2019 - 21:48

Thanks Birdy, well spotted! It has been amended and added…thanks!

jojo 10 September 2019 - 12:21

hello, for the MC optimization is it possible to apply other constraints such as sector constraints for a portfolio that has 100+ plus names? is it possible to share a sample of the code for sector constraints and how to incorporate into existing MC code?
many thanks

s666 22 September 2019 - 11:10

Hi jojo, apologies for the late reply… To assign sector constraints etc should be possible of course, it would depend on you having the data of which stock related to which sector. If you have this data available I would be happy to take a look and see if I can create what you have described. If so, ping me a message here and I will send you my contact details to forward the data file on to.

Cristovam Peres 11 September 2019 - 03:08

Hi Stuart! Congratulations for your work.Very inspiring.

Going forward, did you ever try implementing the Black-Litterman model using Python?

s666 22 September 2019 - 11:08

Hi Cristovam, apologies for the late reply. Actually I haven't yet, but it is something I've been thinking about doing. Is it something you would be particularly interested in seeing?

Mandar Priya Phatak 13 September 2019 - 10:48

Sir,
I have just started my journey in Python, and I met with an error in the first step: pandas_datareader is not working anymore. Is there some other library for getting the data from Yahoo Finance? Will be waiting for your reply.

s666 22 September 2019 - 11:06

Apologies for the late reply… What was the error you are receiving? The pandas data reader is currently still working so you should be able to use it.

Chris 24 September 2019 - 16:27

Hello there!

First of all this code is awesome and works exactly the way I would want a portfolio optimization setup to work. Great work, appreciate your time to create.

Second, I wanted to know how difficult it would be to implement a $ value of the capital and constrain it such that it has to choose funds with a minimum fund amount (i.e. Vanguard funds require a minimum of $3000). I know currently there are no dollars involved in terms of portfolio amount, but this is the piece I am looking to add on.

Let me know your thoughts,

Thanks again!

s666 4 October 2019 - 04:40

Hi Chris, perhaps you could specify a starting portfolio value and then create a constraint such that the percentage held in any asset must equate to a certain absolute value in terms of dollars… So if you had a portfolio starting value of 100,000 and the minimum you wanted was 3,000 as mentioned, you could just set the constraint at 3%.
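For example, that 3% floor can be sketched simply by raising the lower bound passed to "minimize" (a toy variance minimisation with a made-up covariance matrix, not the post's actual data):

```python
import numpy as np
import scipy.optimize as sco

# made-up covariance matrix for 3 hypothetical assets; the third is deliberately
# very volatile so the optimiser would otherwise give it almost no weight
cov = np.array([[0.10, 0.02, 0.01],
                [0.02, 0.08, 0.03],
                [0.01, 0.03, 2.00]])

def portfolio_variance(weights, cov):
    return weights @ cov @ weights

num_assets = len(cov)
constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
# raising the lower bound from 0.0 to 0.03 enforces the 3% minimum position
bounds = tuple((0.03, 1.0) for _ in range(num_assets))

result = sco.minimize(portfolio_variance, num_assets * [1. / num_assets],
                      args=(cov,), method='SLSQP',
                      bounds=bounds, constraints=constraints)
print(result.x)  # the volatile third asset is pinned at its 3% floor
```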

anarchy 30 October 2019 - 20:09

Hi, great article, was wondering how you would modify your code if you wanted to include short positions

s666 4 November 2019 - 13:44

Sure thing – it should be possible with the code below:

Firstly add this function:

from random import uniform as rand
def randConstrained(n, m, low, high):
    tot = m
    if not low <= 0 <= high:
        raise ValueError("Cannot guarantee a solution when the input does not allow for 0s")
    answer = []
    for _ in range(n-1):
        answer.append(low + rand(0,tot) * (high-low))
        tot -= answer[-1]
    answer.append(m-sum(answer))
    return answer

and then change the code in the "simulate_random_portfolios" function so that instead of the lines:

weights = np.random.random(len(mean_returns))
weights /= np.sum(weights)

you have (for example - with 5 stocks that you want to sum to a weight of 1, with any individual stock being allowed to range from -1 to 1):

weights = np.asarray(randConstrained(5,1,-1,1))

You can of course change the n, m, low, high arguments to fit your requirements.

I haven't tested for any bugs this may introduce further down the line - but this solves the first problem at least!!! 😉

Greg Burke 19 November 2019 - 12:54

For the annualized returns, how come you are not raising the returns to the power of 252?

I get annualized vol, but is there a syntax or finance reason it's not:

def calc_portfolio_perf(weights, mean_returns, cov, rf):
    portfolio_return = ((1 + np.sum(mean_returns * weights)) ** 252) - 1

Siri 22 November 2019 - 23:47

Hello,

Thank you very much for publishing this! It's easy to follow and very helpful.

I have two questions about the second method of optimization using the minimize function.

1- When calling the 'calc_portfolio_std' function in sco.minimize, where are the "weights" variables being passed on from? After running the code, I printed out what those weights were, and they were different from the weights resulting from the minimum variance function.

2- If I wanted to add a portfolio tracking error constraint to the minimum variance function, how can I incorporate that in the code? Given that I have certain benchmark returns and weights for the same stocks in my portfolio

Grdl 20 March 2020 - 19:35

How can I plot AAPL, MSFT, GOOGL each as a single-stock portfolio (weight of 1) to see their individual risk and return?

s666 24 March 2020 - 06:51

If just considering one single stock I guess the risk and return would just be the historic CAGR and the annualised standard deviation of the stock returns no?

tutor estinth dzir 21 March 2020 - 21:52

Hi,
I have many difficulties introducing the "Short" possibility. I can't find how to tell the program that weights can take values between -1 and 1.
Can You help me ?

s666 24 March 2020 - 06:49

Hi there, it depends whether you are working with the Monte Carlo style random portfolio method, or the method using the Scipy "optimize" approach. Which one are you trying to implement please?

Kevin 9 April 2020 - 15:53

Hi,
Is it possible to include dividends on returns?
Thanks,

Piotr Arendarski 12 April 2020 - 12:23

Thank you S666 for another solid piece of financial code in Python! I really like your professional, storytelling-like approach to optimisation and the previous topics.

Dear Mandar,
There have been some changes in ‘data reader’ library.

You can use this piece of code and modify it accordingly:

#import library
import pandas_datareader as pdr

#set dates
start = datetime.datetime(2018, 3, 1)
end = datetime.datetime(2018, 12, 31)

#fetch data
cme = pdr.get_data_yahoo('CME', start, end)

You can also easily use the data feed from stooq.com or stooq.pl – you will find more macro data there, I guess.

cme = pdr.get_data_stooq('CME', start, end)

I hope you will find this helpful.

Piotr

AK 15 April 2020 - 07:56

Hi and thank you for sharing this!

I have two questions for which your advice would be much appreciated:
1. How can I provide my own historical data from a csv or spreadsheet file instead of reading from on online source?
2. What happens if the starting date of the timeseries of the securities/instruments used is not matching?
e.g. let’s say that one instrument starts only in 2010 while another starts in 2005. How will the return calculations and the correlation matrix take this into account?

Keep up the good work!

Best,
AK

Reply
Raphael Mettle 27 April 2020 - 00:57

Impressive work! Thanks for the intellectually stimulating content. In the calculation of the portfolio standard deviation, where do you factor the multiplication of the constant ‘2’ in the calculus?

Reply
Raphael 27 April 2020 - 01:22

I remember it now, deriving the formula for modern portfolio theory. Thanks for the impressive work.

Reply
Matias Aguirre 23 June 2020 - 06:06

You should do a YouTube video on this!!!

Reply
Cyril Quijoux 20 July 2020 - 15:49

Hi Greg,

I think you are right, it seems there is a small mistake regarding the annualization of the returns. Multiplying by 252 is only right if we’re dealing with log returns but it’s not the case here.

Anyway, it's a great and inspiring article.
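As a quick illustration of the point about annualisation, with a few made-up daily returns – scaling the mean by 252 is exact for log returns, while for simple returns the geometric compounding formula is the precise one:

```python
import numpy as np

# Toy daily simple returns (hypothetical values).
daily = np.array([0.01, -0.005, 0.007, 0.002])

# Log returns are additive, so scaling the mean by 252 is exact for them.
log_returns = np.log1p(daily)
ann_log = log_returns.mean() * 252

# For simple returns, the exact (compounded) annualisation is geometric.
ann_simple = (1 + daily).prod() ** (252 / len(daily)) - 1

# mean * 252 on simple returns is only an approximation of ann_simple.
approx = daily.mean() * 252
```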

Reply
anarchy 28 July 2020 - 09:05

Wow, I did not get any notification for your reply… haha… I just saw it. Is there a way to add shorting for only selected securities?

Reply
Carlo P. 14 August 2020 - 00:58

Thanks for the great post!
It would also be nice if you could update the code adding a constraint for a minimum % holding position and a max % holding position, e.g. the max you can allocate to each stock is 20%.

Thanks much!

Reply
Ryan 27 August 2020 - 19:02

You look like a remarkable dad! That is a tremendous accomplishment!! Congrats!!

Reply
s666 27 August 2020 - 19:25

In what way am I a remarkable dad? I'm sorry, I'm not understanding…

Reply
Mariano Miró 7 October 2020 - 03:54

Excellent analysis. You obviously have a deep understanding of finance and programming. I am just starting with programming and I want to deepen my knowledge in data analysis and financial analysis. This helped me a lot. Thank you so much for sharing it.

Reply
Gus 14 November 2020 - 18:11

Hello Stuart,
I’m trying to follow this amazing investment tutorial/Python-code, and in my PC (Linux/Python 3.6.9), it runs well till it reaches the “localization of the portfolio with minimum VaR” (after the random portfolios simulation). It fails there with the following error code:
“/home/ni/.local/lib/python3.6/site-packages/pandas/core/indexing.py”, line 1493, in _getitem_axis
raise TypeError(“Cannot index by location index with a non-integer key”)
Have you, or any of the people on this forum, had this issue?
Thank you for your time,
Gus

Reply
s666 14 November 2020 - 18:30

Hi Gus – I assume you are referring to the line that reads:

#locate position of portfolio with minimum VaR
min_VaR_port = results_frame.iloc[results_frame['VaR'].idxmin()]

The error message is telling you that you are trying to use a label based key but the method you are using only accepts an integer as an index key.

You notice the use of “.iloc” – the i stands for “integer” and the loc stands for “location” – using “iloc” requires that you pass it an integer, which seemingly you are not.

Either you have made a typo and used an integer key with “.loc” (notice the lack of i) which only accepts label based keys, or vice versa you are using a label with iloc.

Hopefully that makes sense – let me know if you can't resolve it 😉
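A minimal illustration of the difference, with a toy results_frame (made-up VaR numbers):

```python
import pandas as pd

# Toy stand-in for results_frame with hypothetical VaR values.
results_frame = pd.DataFrame({"VaR": [0.05, 0.02, 0.09]})

# idxmin() returns the index *label* of the minimum row. With the default
# integer RangeIndex that label happens to be an integer, so .iloc works:
row = results_frame.iloc[results_frame["VaR"].idxmin()]

# With a non-integer index (e.g. string labels after loading from a file),
# .loc is the accessor that accepts the label idxmin() returns:
labelled = results_frame.set_axis(["a", "b", "c"])
row2 = labelled.loc[labelled["VaR"].idxmin()]
```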

Reply
G 16 November 2020 - 21:45

Hi Stuart,
thank you for your comments.
(I understand the pandas restrictions on ".iloc".)
Anyway, I started from scratch, and got (non-null) values for VaR (results_frame).
So, the "min_VaR_port" calculation ran without complaints.
Regards,
Gus

Reply
Youri Cillien 19 November 2020 - 19:54

Hey Stuart,
Hats off for this superb article. It is a pleasure to read for someone who isn’t as proficient in Python yet, because the explanations for the different lines of code are extremely helpful.

I just have a few issues when running the code. I am trying to do the exact same thing as you do in the first approach but with 24 different stocks. Everything runs fine except for the fact that my graph looks off and it doesn’t have the typical minimum variance frontier. I am not able to post a picture here so it might be difficult to illustrate, but basically my graph looks more like a circle with the different portfolio points. Any guess what the problem could be?
Your help would mean a lot.

Cheers,
Youri

Reply
s666 19 November 2020 - 19:58

If you would like to post your code here I am happy to take a look. If possible try to get it correctly formatted as python code by wrapping it with:

<_pre lang="python"_>

</pre>

at the start and end – NOTE: DON'T include the underscores at the start and end of each line – I have just added them to allow the actual wrappers to be visible and not converted into HTML themselves…

Reply
Youri Cillien 22 November 2020 - 17:54

Hello,
I have actually been working on it since my original post and it now looks a lot better. The frontier is visible. My guess is that it was due to the fact that too many ‘Adj. Close’ values were missing (probably because I didn’t choose the correct ticker), which I then replaced using a simple Forward Fill.

I do have a different question though, related to the individual stock weights. Is it possible to cap the weights at 8% so that no stock is attributed more than that and further that the excess weight is then evenly distributed to other stocks. If yes, how can I implement this using the code you provided.
I know this question has been asked under a different article of yours, but I couldn’t find the answer yet.

Thank you very much for taking the time to help out.

Cheers,
Youri

Reply
s666 23 November 2020 - 06:58

Hi Youri – A very quick way to do it would be to change your "bounds" within the "max_sharpe_ratio" function. Change it from "bound = (0.0,1.0)" to "bound = (0.0,0.08)".

That will set an upper bound of 8% on each holding. I’m not certain the outcome will be EXACTLY as it would be if you strictly followed the method of “evenly distributing to other stocks” but this will get you closer to what could be considered “mean-variance” efficient, with your required upper bound of 8%.

Hope that helps.
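A sketch of how that bound slots into a scipy "optimize" call – the return and covariance numbers below are randomly generated stand-ins for your own estimates, and the function and variable names are illustrative rather than lifted from the article:

```python
import numpy as np
import scipy.optimize as sco

# Hypothetical inputs for 24 assets, purely to demonstrate the bounds.
rng = np.random.default_rng(0)
n = 24
mean_returns = rng.normal(0.0005, 0.0002, n)
A = rng.normal(size=(n, n))
cov = A @ A.T / n  # positive semi-definite covariance stand-in

def neg_sharpe(weights):
    # Negative annualised Sharpe ratio (risk-free rate assumed zero).
    ret = weights @ mean_returns * 252
    vol = np.sqrt(weights @ cov @ weights * 252)
    return -ret / vol

# Weights must sum to 1; each weight capped at 8%.
constraints = ({"type": "eq", "fun": lambda w: w.sum() - 1},)
bounds = tuple((0.0, 0.08) for _ in range(n))

result = sco.minimize(neg_sharpe, np.full(n, 1 / n), method="SLSQP",
                      bounds=bounds, constraints=constraints)
```

Note that the cap must be feasible: with 24 assets an 8% cap still allows the weights to sum to 1 (24 × 0.08 = 1.92), whereas with, say, 10 assets it would not.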

Reply
Youri Cillien 30 November 2020 - 18:54

Hi Stuart,
Thanks a lot, it worked!
Cheers,
Youri

Reply
Simon 21 November 2021 - 11:08

The weights are not between the range -1 and 1…

Do you have another solution?

Reply
s666 21 November 2021 - 14:54

Honestly, I am not even sure it can be done – and if it can, it is by no means a trivial problem when negative weights are allowed and must also be forced to sum to 1.0. I have played around and there are solutions where there are only positive weights for the individual stocks, which can of course sum to 1.0.

There are also solutions whereby negative weights are allowed but the result sums to zero. That problem is really solved by allowing a portion of the weights to be determined by the values of the earlier weights, hence it is no longer "random" in the true sense.

I've hacked together something, although I am not certain it is really fit for purpose. Here it is anyway, as you may find it helpful:
//

import random

n = 10
nums = []
while not (nums and sum(nums) == 0 and all(-1 <= v <= 1 for v in nums)):
    nums = []  # reset on each attempt, otherwise failed draws accumulate
    for i in range(n - 1):
        nums.append(random.uniform(-1, 1))
    nums.append(-sum(nums))
# The line below is the adjustment
nums = [x + (1 / n) for x in nums]
print(nums, sum(nums), '\n', all(-1 <= v <= 1 for v in nums))

//
That will produce a list of floating point numbers between -1 and 1 that all sum to 0.0.

If you really want it to sum to 1 then the only thing I can think of on the spot is to then just iterate over the numbers and adjust them accordingly, by adding 1/n to each of them. However this risks pushing some of the values up above the 1.0 threshold so you may want to come up with your own adjustment method.

Just be reminded this is a pretty horrible hack and you may well find that when run through a large number of simulations the numbers that are produced end up being skewed massively one way or another and the lack of randomness is revealed. Best I can do just for now I am afraid.
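As a footnote to the above: for the long-only case, numpy's Dirichlet sampler draws uniformly from the simplex directly (every weight in [0, 1], summing to exactly 1), and a small variant of the hack handles the long/short case. This is a sketch under the same caveats, not a recommendation:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10

# Long-only: a Dirichlet draw with unit concentration gives uniform
# random weights on the simplex -- all in [0, 1], summing to 1.
weights = rng.dirichlet(np.ones(n))

# Long/short in [-1, 1] summing to 1: draw n-1 weights, set the last to
# whatever forces the sum to 1, and reject if it falls outside [-1, 1].
while True:
    w = rng.uniform(-1, 1, n - 1)
    w = np.append(w, 1.0 - w.sum())  # force the sum to 1 directly
    if abs(w[-1]) <= 1:
        break
```

The rejection step shares the skew caveat mentioned above: the last weight is not drawn from the same distribution as the others, so it is not uniform over all valid portfolios.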

Reply
Oliver 9 December 2021 - 20:36

Hello S666,

let's imagine we would like to implement a buy and hold strategy for the above portfolio (tickers = ['AAPL', 'MSFT', 'NFLX', 'AMZN', 'GOOG']), based on the weighting of the Maximum Sharpe Ratio portfolio.

What would the code look like?

Best
Nicole

Reply
