Investment Portfolio Optimisation with Python

In this post I’ll be looking at investment portfolio optimisation with python, the fundamental concept of diversification and the creation of an efficient frontier that can be used by investors to choose specific mixes of assets based on investment goals; that is, the trade off between their desired level of portfolio return vs their desired level of portfolio risk.

Investopedia defines “Portfolio Theory” as:

“Modern Portfolio Theory (MPT), a hypothesis put forth by Harry Markowitz in his paper “Portfolio Selection,” (published in 1952 by the Journal of Finance) is an investment theory based on the idea that risk-averse investors can construct portfolios to optimize or maximize expected return based on a given level of market risk, emphasizing that risk is an inherent part of higher reward. It is one of the most important and influential economic theories dealing with finance and investment.

Also called “portfolio theory” or “portfolio management theory,” MPT suggests that it is possible to construct an “efficient frontier” of optimal portfolios, offering the maximum possible expected return for a given level of risk. It suggests that it is not enough to look at the expected risk and return of one particular stock. By investing in more than one stock, an investor can reap the benefits of diversification, particularly a reduction in the riskiness of the portfolio. MPT quantifies the benefits of diversification, also known as not putting all of your eggs in one basket.”

It’s very easy to run a few lines of Python to download data for a single stock, calculate the mean daily return and daily standard deviation of returns, and then just annualise them to get mean expected annual return and volatility of that single stock.

Let’s choose Apple as an example stock.

We can do this as follows:

import pandas_datareader.data as web
import numpy as np
import pandas as pd
 
stock = ['AAPL']
 
data = web.DataReader(stock,data_source="yahoo",start='01/01/2010')['Adj Close']
 
returns = data.pct_change()
 
mean_return = returns.mean()
return_stdev = returns.std()
 
annualised_return = round(mean_return * 252,2)
annualised_stdev = round(return_stdev * np.sqrt(252),2)
 
print ('The annualised mean return of stock {} is {}, ' 
        'and the annualised volatility is {}').format(stock[0],annualised_return,annualised_stdev)

which gets us the output:

"The annualised mean return of stock AAPL is 0.24, and the annualised volatility is 0.26"

This simple calculation can obviously be carried out for any stock you have in mind, but not many investor portfolios are made up of just one stock – they are made up of 10s of stocks, 100s of stocks sometimes. The rule regarding diversification is that it’s marginal benefit reduces as the number of stocks increases…in other words, the diversification benefit you get when moving from 1 stock to 2 stocks, is larger than when moving from 2 stocks to 3 stocks and so on (this is obviously just a general rule and the actual diversification benefits of any given stock depends on its correlation with the existing portfolio).

So what we are interested in is not the expected return and volatility (standard deviation) of a collection of individual stocks, but rather we want that information for the portfolio of stocks as a whole. This will capture the benefits of diversification of less than perfect correlation between the stocks in the portfolio.

So let’s now assume that we hold a portfolio of 4 tech stocks, Apple, Microsoft, Amazon and Yahoo…how do we start to calculate the expected return and volatility of that portfolio?

Well we need as our first input, the weights of the stocks in the portfolio – that is, how much of each stock do we hold as a percentage of the entire portfolio holdings.

Let’s say our portfolio is made up of 50% Apple stock, 20% Microsoft stock, 20% Amazon stock and 10% Yahoo. We can calculate the portfolio expected return and volatility as follows:

#list of stocks in portfolio
#NOTE THAT THESE MUST BE ENTERED IN ALPHABETICAL ORDER FOR THE RESULTS TO BE CORRECT!!!
stocks = ['AAPL','AMZN','MSFT','YHOO']
 
#download daily price data for each of the stocks in the portfolio
data = web.DataReader(stocks,data_source='yahoo',start='01/01/2010')['Adj Close']
 
#convert daily stock prices into daily returns
returns = data.pct_change()
 
#calculate mean daily return and covariance of daily returns
mean_daily_returns = returns.mean()
cov_matrix = returns.cov()
 
#set array holding portfolio weights of each stock
weights = np.asarray([0.5,0.2,0.2,0.1])
 
#calculate annualised portfolio return
portfolio_return = round(np.sum(mean_daily_returns * weights) * 252,2)
#calculate annualised portfolio volatility
portfolio_std_dev = round(np.sqrt(np.dot(weights.T,np.dot(cov_matrix, weights))) * np.sqrt(252),2)
 
print('Portfolio expected annualised return is {} and volatility is {}').format(portfolio_return,portfolio_std_dev)

Which gets us:

"Portfolio expected annualised return is 0.23 and volatility is 0.21"

Great so we’re getting there, we now know the expected return and volatility of our current portfolio. But what if we aren’t happy with the level of volatility of our current portfolio and would like to reduce it? What if we are willing to take on more risk in search of a higher expected return? How can we rearrange the weight of each stock in our portfolio to achieve these goals?

Well, we could start by manually changing the weights in the array in our previous code, run the program again and see what the expected return and volatility of that particular set of weights comes out at. But that’s a VERY manual way to proceed….there are technically an infinite number of sets of portfolio weights to test, so obviously that’s not practical.

Luckily we can use Monte Carlo simulation to run 1000s of runs of different randomly generated weights for the individual stocks (obviously making sure the weights sum to 100%) and then calculate the expected return, expected volatility and Sharpe Ratio for each of the randomly generated portfolios.

It’s then very simple (and helpful) to plot these combinations of expected returns and volatilities on a scatter plot – we can even colour the data points based on the Sharpe Ratio of that particular portfolio. So let’s do this:

import numpy as np
import pandas as pd
import pandas_datareader.data as web
import matplotlib.pyplot as plt
 
 
#list of stocks in portfolio
stocks = ['AAPL','AMZN','MSFT','YHOO']
 
#download daily price data for each of the stocks in the portfolio
data = web.DataReader(stocks,data_source='yahoo',start='01/01/2010')['Adj Close']
 
#convert daily stock prices into daily returns
returns = data.pct_change()
 
#calculate mean daily return and covariance of daily returns
mean_daily_returns = returns.mean()
cov_matrix = returns.cov()
 
#set number of runs of random portfolio weights
num_portfolios = 25000
 
#set up array to hold results
results = np.zeros((3,num_portfolios))
 
 
for i in xrange(num_portfolios):
    #select random weights for portfolio holdings
    weights = np.random.random(4)
    #rebalance weights to sum to 1
    weights /= np.sum(weights)
 
    #calculate portfolio return and volatility
    portfolio_return = np.sum(mean_daily_returns * weights) * 252
    portfolio_std_dev = np.sqrt(np.dot(weights.T,np.dot(cov_matrix, weights))) * np.sqrt(252)
 
    #store results in results array
    results[0,i] = portfolio_return
    results[1,i] = portfolio_std_dev
    #store Sharpe Ratio (return / volatility) - risk free rate element excluded for simplicity
    results[2,i] = results[0,i] / results[1,i]
 
#convert results array to Pandas DataFrame
results_frame = pd.DataFrame(results.T,columns=['ret','stdev','sharpe'])
 
#create scatter plot coloured by Sharpe Ratio
plt.scatter(results_frame.stdev,results_frame.ret,c=results_frame.sharpe,cmap='RdYlBu')
plt.colorbar()

From the figure above it’s obvious that changing the weight of each stock in the portfolio can have a dramatic effect on the expected return and level of risk (standard deviation/volatility) the investor is exposed to.

We can see, for example, that if the investor is targeting a return of 18%, that can be achieved by holding a portfolio with a volatility of just under 20%, but some portfolios share that same expected return with a volatility as high as just over 28%. This makes it very clear that we must be very thoughtful when choosing how much weight each of our stocks in our portfolio should carry.

Two portfolios that we may like to highlight as being “special” are 1) the portfolio with the highest Sharpe Ratio (i.e. the highest risk adjusted returns) and 2) The “minimum variance portfolio” which is the portfolio with the lowest volatility.

We can locate these 2 portfolios by making a few changes to our code to store all the random weight arrays used for each run of the Monte Carlo simulation along side the expected return, standard deviation and Sharpe Ratio as we go along, then locate the point in the resulting DataFrame where the Sharpe Ratio is highest for portfolio “1” and where the standard deviation is lowest for portfolio “2”. Then – because we stored them as we went along, we can extract the array of weights that corresponds to each of those two portoflios and bang! there we have all the info we need to construct either of the two portfolios!

The full code is as follows (with the changes/additions highlighted):

import numpy as np
import pandas as pd
import pandas_datareader.data as web
import matplotlib.pyplot as plt
 
 
#list of stocks in portfolio
stocks = ['AAPL','AMZN','MSFT','YHOO']
 
#download daily price data for each of the stocks in the portfolio
data = web.DataReader(stocks,data_source='yahoo',start='01/01/2010')['Adj Close']
 
#convert daily stock prices into daily returns
returns = data.pct_change()
 
#calculate mean daily return and covariance of daily returns
mean_daily_returns = returns.mean()
cov_matrix = returns.cov()
 
#set number of runs of random portfolio weights
num_portfolios = 25000
 
#set up array to hold results
#We have increased the size of the array to hold the weight values for each stock
results = np.zeros((4+len(stocks)-1,num_portfolios))
 
for i in xrange(num_portfolios):
    #select random weights for portfolio holdings
    weights = np.array(np.random.random(4))
    #rebalance weights to sum to 1
    weights /= np.sum(weights)
 
    #calculate portfolio return and volatility
    portfolio_return = np.sum(mean_daily_returns * weights) * 252
    portfolio_std_dev = np.sqrt(np.dot(weights.T,np.dot(cov_matrix, weights))) * np.sqrt(252)
 
    #store results in results array
    results[0,i] = portfolio_return
    results[1,i] = portfolio_std_dev
    #store Sharpe Ratio (return / volatility) - risk free rate element excluded for simplicity
    results[2,i] = results[0,i] / results[1,i]
    #iterate through the weight vector and add data to results array
    for j in range(len(weights)):
        results[j+3,i] = weights[j]
 
#convert results array to Pandas DataFrame
results_frame = pd.DataFrame(results.T,columns=['ret','stdev','sharpe',stocks[0],stocks[1],stocks[2],stocks[3]])
 
#locate position of portfolio with highest Sharpe Ratio
max_sharpe_port = results_frame.iloc[results_frame['sharpe'].idxmax()]
#locate positon of portfolio with minimum standard deviation
min_vol_port = results_frame.iloc[results_frame['stdev'].idxmin()]
 
#create scatter plot coloured by Sharpe Ratio
plt.scatter(results_frame.stdev,results_frame.ret,c=results_frame.sharpe,cmap='RdYlBu')
plt.xlabel('Volatility')
plt.ylabel('Returns')
plt.colorbar()
#plot red star to highlight position of portfolio with highest Sharpe Ratio
plt.scatter(max_sharpe_port[1],max_sharpe_port[0],marker=(5,1,0),color='r',s=1000)
#plot green star to highlight position of minimum variance portfolio
plt.scatter(min_vol_port[1],min_vol_port[0],marker=(5,1,0),color='g',s=1000)

And now, all that’s left to do is pull out the weights of the stocks needed to create the two portfolios highlighted above, and that can be done as follows:

print(max_sharpe_port)

ret 0.255758
stdev 0.220652
sharpe 1.159100
AAPL 0.472286
MSFT 0.387202
AMZN 0.090554
YHOO 0.049959
Name: 7146, dtype: float64

(PLEASE NOTE THESE RESULTS ARE INCORRECT AS THEY WERE RUN PREVIOUS TO CORRECTLY SETTING MY INITIAL STOCK LIST IN ALPHABETICAL ORDER – I SHALL TRY TO FIX WHEN I GET A MOMENT)

and

print(min_vol_port)

ret 0.196974
stdev 0.194356
sharpe 1.013470
AAPL 0.299661
MSFT 0.087340
AMZN 0.433787
YHOO 0.179211
Name: 5792, dtype: float64

Ok, I think I’ll leave it there for now. As always, any comments or questions, please do leave them in the comments section below.

Until next time…

It's only fair to share...Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedInEmail this to someonePin on PinterestShare on Reddit
Written by s666