Welcome to Part 2 of the series of posts dealing with how to build your own python based personal portfolio /wealth simulation model. At the end of the first post (which can be found here), we got to the point where we had modelled some inflows, some outflows, we had applied an annual salary raise to our future income flows, along with applying various tax rates to both our active income (salary) and investment income.
We had also factored in a stochastic element to our model for generating investment returns, using the historic mean monthly return and volatility of the S&P Total Return index as a proxy for the “market”. Finally, we had ended by adding a couple of lines of code that would record whether our wealth/asset value were reduced to (or below) zero at any point throughout the simulated time period; we will be using this later to help calculate our “risk of ruin”, i.e. attach probabilities to the likelihood of ending up with assets worth less than a threshold value (one that we deem unacceptable to fall below – which doesn’t have to be zero of course).
The full code snippet from the end of Part 1 is shown below for convenience (I have altered some of the outflow and inflow values back to their original levels just fyi). IMPORTANT: I have also had to include the use of the yfinance package to allow the over-riding of the Pandas DataReader package as at the time of writing, Pandas DataReader is currently not stable and essentially not working. The code changes are commented below to help identify them more easily.
import pandas as pd import numpy as np import random # change syntax and name of Pandas DataReader import from pandas_datareader import data as pdr # import yfinance as pandas data-reader currently not stable import yfinance as yf import datetime import matplotlib as mpl import matplotlib.pyplot as plt %matplotlib inline # set the numpy random seed to allow replication of results np.random.seed(seed=7) start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1) # add yfinance-pandas data-reader overide (now we can use pandas data-reader methods again as currently unstable) yf.pdr_override() # change our download to use new over-ridden method sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end) sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0] sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0] inflows = {'active_annual_income':50_000, 'starting_assets': 250_000} outflows = {'rent':1500, 'credit_card_payment':750, 'medical_insurance':250, 'pension_contribution':500, 'misc': 1500} variables = {'start_date' : "01/01/2020", 'years': 10, 'tax_on_active_income_gains': 0.25, 'avg_ann_income_raise':0.05, 'avg_ann_inflation': 0.02, 'tax_on_investment_gains': 0.35, 'avg_monthly_market_returns': sp_monthly_pct_return, 'avg_monthly_market_volatility': sp_monthly_std_dev} income_gains_storage = [] investment_gains_storage = [] assets_starting_list = [inflows['starting_assets']] assets_ending_list = [] months = variables['years'] * 12 ruined = False for month in range(months): if assets_ending_list: assets_starting_list.append(assets_ending_list[-1]) assets = assets_starting_list[-1] assets -= (outflows['rent'] + outflows['credit_card_payment'] + \ outflows['medical_insurance'] + outflows['pension_contribution'] + \ outflows['misc']) if assets <= 0: inv_gain = 0 ruined = True break market_return = np.random.normal(variables['avg_monthly_market_returns'], variables['avg_monthly_market_volatility'], 1)[0] investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains']) investment_gains_storage.append(investment_return) assets += investment_return income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12 income_gains_storage.append(income) if (month % 12 == 0): inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise'])) outflows['rent'] *= (1 + (variables['avg_ann_inflation'])) outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation'])) outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation'])) outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation'])) outflows['misc'] *= (1 + (variables['avg_ann_inflation'])) assets += income assets_ending = assets assets_ending_list.append(assets_ending) plt.plot(pd.Series(assets_ending_list)) plt.xlabel('Month') plt.ylabel('Ending Asset Value') plt.show()

The next element we are going to incorporate is one that allows us to set a future date at which we plan to retire and effectively cease working, with the accompanying expectation that our active income (i.e. salary) will fall dramatically, if not disappear altogether. Retirement is an important stage of life to plan for, for many reasons; not only would our inflows from salary be affected, it is likely many of our outflows would change too. By retirement age, it is not uncommon for people to have paid off any outstanding mortgage debt on their homes. This means no more monthly mortgage outflows, and offers other opportunities such as equity release schemes, or “downsizing” of property etc.
Of utmost importance is to ensure that by the time you retire, you have built an asset base large enough to sustain your needs and generate enough investment income to cover your living costs, for however many years you end up living in retirement.
With this in mind let us start by adding a planned retirement date (entered as years from today until retirement; I have chosen 25 years in the future as that would place me around the right age). We shall also extend the simulation period from 10 years to 40. I have also added a couple of lines to check whether we are before or after our planned retirement date and adjust our monthly income (i.e. salary inflow) accordingly.
np.random.seed(seed=7) start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1) yf.pdr_override() sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end) sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0] sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0] inflows = {'active_annual_income':50_000, 'starting_assets': 250_000} outflows = {'rent':1500, 'credit_card_payment':750, 'medical_insurance':250, 'pension_contribution':500, 'misc': 1500} variables = {'start_date' : "01/01/2020", 'years': 40, 'retirement_year': 25, 'tax_on_active_income_gains': 0.25, 'avg_ann_income_raise':0.05, 'avg_ann_inflation': 0.02, 'tax_on_investment_gains': 0.35, 'avg_monthly_market_returns': sp_monthly_pct_return, 'avg_monthly_market_volatility': sp_monthly_std_dev} income_gains_storage = [] investment_gains_storage = [] assets_starting_list = [inflows['starting_assets']] assets_ending_list = [] months = variables['years'] * 12 ruined = False for month in range(months): if assets_ending_list: assets_starting_list.append(assets_ending_list[-1]) assets = assets_starting_list[-1] assets -= (outflows['rent'] + outflows['credit_card_payment'] + \ outflows['medical_insurance'] + outflows['pension_contribution'] + \ outflows['misc']) if assets <= 0: inv_gain = 0 ruined = True break market_return = np.random.normal(variables['avg_monthly_market_returns'], variables['avg_monthly_market_volatility'], 1)[0] investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains']) investment_gains_storage.append(investment_return) assets += investment_return # add logic to account for zero salary income once retired if month >= variables['retirement_year'] * 12: income = 0 else: income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12 income_gains_storage.append(income) if (month % 12 == 0): inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise'])) outflows['rent'] *= (1 + (variables['avg_ann_inflation'])) outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation'])) outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation'])) outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation'])) outflows['misc'] *= (1 + (variables['avg_ann_inflation'])) assets += income assets_ending = assets assets_ending_list.append(assets_ending) plt.plot(pd.Series(assets_ending_list)) plt.xlabel('Month') plt.ylabel('Ending Asset Value') plt.show()

Now we can clearly see the effect our drop in salary after retirement has on our ending asset values through time. As is perfectly logical, with no salary coming in the tendency is for our personal wealth and asset base to fall over time. Firstly, we still need to account for the fact that both our post-retirement inflows and outflows are usually fundamentally different from the “normal”, non-retirement situation, and we shall come to that in just a second.
Let’s assume post-retirement we no longer pay “rent” or mortgage payments, so the current rent outflow of 1,500 no longer applies once we hit our retirement date. We could also imagine our health insurance costs and general medical expenses would rise significantly on average, and we might start to receive some pension income each month. The exact situation will vary for each individual, so for now I shall apply the above logic and go from there.
np.random.seed(seed=7) start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1) yf.pdr_override() sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end) sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0] sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0] inflows = {'active_annual_income':50_000, 'starting_assets': 250_000, 'monthly_pension': 1500} # add estimated post-retirement monthly pension outflows = {'rent':1500, 'credit_card_payment':750, 'medical_insurance':250, 'pension_contribution':500, 'misc': 1500, 'retirement_medical_expenses':850, # add post-retirement monthly medical expenses 'retirement_misc': 2000} # add post-retirement misc costs variables = {'start_date' : "01/01/2020", 'years': 40, 'retirement_year': 25, 'tax_on_active_income_gains': 0.25, 'avg_ann_income_raise':0.05, 'avg_ann_inflation': 0.02, 'tax_on_investment_gains': 0.35, 'avg_monthly_market_returns': sp_monthly_pct_return, 'avg_monthly_market_volatility': sp_monthly_std_dev} income_gains_storage = [] investment_gains_storage = [] assets_starting_list = [inflows['starting_assets']] assets_ending_list = [] months = variables['years'] * 12 ruined = False for month in range(months): if assets_ending_list: assets_starting_list.append(assets_ending_list[-1]) assets = assets_starting_list[-1] # add logic to account for different outflows in retirement if month >= variables['retirement_year'] * 12: outflow = outflows['retirement_medical_expenses'] + outflows['retirement_misc'] else: outflow = (outflows['rent'] + outflows['credit_card_payment'] + \ outflows['medical_insurance'] + outflows['pension_contribution'] + \ outflows['misc']) assets -= outflow if assets <= 0: inv_gain = 0 ruined = True break market_return = np.random.normal(variables['avg_monthly_market_returns'], variables['avg_monthly_market_volatility'], 1)[0] investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains']) investment_gains_storage.append(investment_return) assets += investment_return # add logic to account for pension received in retirement with no salary if month >= variables['retirement_year'] * 12: income = inflows['monthly_pension'] else: income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12 income_gains_storage.append(income) if (month % 12 == 0): inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise'])) outflows['rent'] *= (1 + (variables['avg_ann_inflation'])) outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation'])) outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation'])) outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation'])) outflows['misc'] *= (1 + (variables['avg_ann_inflation'])) assets += income assets_ending = assets assets_ending_list.append(assets_ending) plt.plot(pd.Series(assets_ending_list)) plt.xlabel('Month') plt.ylabel('Ending Asset Value') plt.show()

Ok the above code implements the logic we have discussed thus far. It is, however, getting somewhat messy and could really do with being cleaned up. There are a few parts of the script that could be wrapped up as functions – the two parts that spring to mind are those for calculating our inflows and outflows each period. Let’s start with the outflows:
def calculate_outflows(month, outflows, variables): # logic to account for different outflows in retirement if month >= variables['retirement_year'] * 12: outflow = outflows['retirement_medical_expenses'] + outflows['retirement_misc'] else: # logic to account for different outflows before retirement outflow = (outflows['rent'] + outflows['credit_card_payment'] + \ outflows['medical_insurance'] + outflows['pension_contribution'] + \ outflows['misc']) # each year, increment the values according to the annual inflation input if (month % 12 == 0) and (month > 0): outflows['rent'] *= (1 + (variables['avg_ann_inflation'])) outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation'])) outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation'])) outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation'])) outflows['misc'] *= (1 + (variables['avg_ann_inflation'])) return outflow
And now our income (not including our investment income – we will create a separate function for that).
def calculate_income(month, inflows, variables): # logic to account for different inflows in retirement if month >= variables['retirement_year'] * 12: income = inflows['monthly_pension'] else: # logic to account for different inflows before retirement income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12 if (month % 12 == 0) and (month > 0): inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise'])) return income
And now to incorporate it into our main script:
np.random.seed(seed=7) start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1) yf.pdr_override() sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end) sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0] sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0] inflows = {'active_annual_income':50_000, 'starting_assets': 250_000, 'monthly_pension': 1500} outflows = {'rent':1500, 'credit_card_payment':750, 'medical_insurance':250, 'pension_contribution':500, 'misc': 1500, 'retirement_medical_expenses':850, 'retirement_misc': 2000} variables = {'start_date' : "01/01/2020", 'years': 40, 'retirement_year': 25, 'tax_on_active_income_gains': 0.25, 'avg_ann_income_raise':0.05, 'avg_ann_inflation': 0.02, 'tax_on_investment_gains': 0.35, 'avg_monthly_market_returns': sp_monthly_pct_return, 'avg_monthly_market_volatility': sp_monthly_std_dev} income_gains_storage = [] investment_gains_storage = [] assets_starting_list = [inflows['starting_assets']] assets_ending_list = [] months = variables['years'] * 12 ruined = False for month in range(months): if assets_ending_list: assets_starting_list.append(assets_ending_list[-1]) assets = assets_starting_list[-1] # calculate outflows using created function outflow = calculate_outflows(month, outflows, variables) assets -= outflow if assets <= 0: inv_gain = 0 ruined = True break market_return = np.random.normal(variables['avg_monthly_market_returns'], variables['avg_monthly_market_volatility'], 1)[0] investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains']) investment_gains_storage.append(investment_return) assets += investment_return # calculate outflows using created function income = calculate_income(month, inflows, variables) income_gains_storage.append(income) # remove the code that was previously here regarding the annual incrementing of values assets += income assets_ending = assets assets_ending_list.append(assets_ending) plt.plot(pd.Series(assets_ending_list)) plt.xlabel('Month') plt.ylabel('Ending Asset Value') plt.show()

We can see from the plot above that the results are the same as those we generated before we added the use of functions, which bodes well and suggests the functions have been correctly structured and applied. We might also want to take the code used to calculate the monthly investment income and warp that into a function. This will allow us to more easily add complexity to the logic if necessary.
def calculate_investment_gains(assets, variables): if assets <= 0: inv_gains = 0 else: market_return = np.random.normal(variables['avg_monthly_market_returns'], variables['avg_monthly_market_volatility'], 1)[0] inv_gains = assets * market_return return inv_gains
And now to add the new investment gains function to our script:
np.random.seed(seed=7) start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1) yf.pdr_override() sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end) sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0] sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0] inflows = {'active_annual_income':50_000, 'starting_assets': 250_000, 'monthly_pension': 1500} outflows = {'rent':1500, 'credit_card_payment':750, 'medical_insurance':250, 'pension_contribution':500, 'misc': 1500, 'retirement_medical_expenses':850, 'retirement_misc': 2000} variables = {'start_date' : "01/01/2020", 'years': 40, 'retirement_year': 25, 'tax_on_active_income_gains': 0.25, 'avg_ann_income_raise':0.05, 'avg_ann_inflation': 0.02, 'tax_on_investment_gains': 0.35, 'avg_monthly_market_returns': sp_monthly_pct_return, 'avg_monthly_market_volatility': sp_monthly_std_dev} income_gains_storage = [] investment_gains_storage = [] assets_starting_list = [inflows['starting_assets']] assets_ending_list = [] months = variables['years'] * 12 ruined = False for month in range(months): if assets_ending_list: assets_starting_list.append(assets_ending_list[-1]) assets = assets_starting_list[-1] outflow = calculate_outflows(month, outflows, variables) assets -= outflow # altered the ``if`` code block below to account for use of new function if assets <= 0: ruined = True break # use our new function to calculate investment returns investment_return = calculate_investment_gains(assets, variables) investment_gains_storage.append(investment_return) assets += investment_return income = calculate_income(month, inflows, variables) income_gains_storage.append(income) assets += income assets_ending = assets assets_ending_list.append(assets_ending) plt.plot(pd.Series(assets_ending_list)) plt.xlabel('Month') plt.ylabel('Ending Asset Value') plt.show()

Just FYI the way we have moved the calculation of monthly returns inside a function and haven’t included the setting of the numpy seed value within that function means that now, each time we run it, we will get slightly different results. I didn’t include the setting of the random seed as if I had, each month’s market return output would have been exactly the same which is of course not what we want in this case.
Now that we have cleaned the code up a bit and started to refactor some of the logic into separate functions I am going to leave it here for this pos. There is still plenty to do before we are finished and we will pick it back up in Part 3.