Mon 19 December 2016

Building a Financial Model with Pandas - Version 2

Introduction

In my last article, I discussed building a financial model in pandas that could be used for multiple amortization scenarios. Unfortunately, I realized that I made a mistake in that approach so I had to rethink how to solve the problem. Thanks to the help of several individuals, I have a new solution that resolves the issues and produces the correct results.

In addition to posting the updated solution, I have taken this article as an opportunity to take a step back and examine what I should have done differently in approaching the original problem. While it is never fun to make a mistake in front of thousands of people, I’ll try to swallow my pride and learn from it.

What was the problem?

If you have not read the last article, it would be helpful to review it in order to get familiar with the problem I was trying to solve. As you can see in the comments, my solution was not correct because I failed to re-calculate the schedule based on the principal reduction from additional principal payments. Basically, I need to recalculate the values each period - which looks like a looping process. Unfortunately, this was what I was explicitly trying to avoid.

What is the solution?

Based on comments from several knowledgeable readers, I think the best solution is to use a generator to build up the table and return it in a format that can be easily converted to a pandas dataframe. The benefit of the generator is that it gracefully handles the situation where the loan is paid off early due to additional principal payments. If you are not familiar with python generators and their usage, this post is helpful.

The first step in the updated solution is to build the amortize function which effectively loops through each period and returns an OrderedDict which can be easily converted into a pandas dataframe.

import pandas as pd
from datetime import date
import numpy as np
from collections import OrderedDict
from dateutil.relativedelta import *


def amortize(principal, interest_rate, years, addl_principal=0, annual_payments=12, start_date=date.today()):

    pmt = -round(np.pmt(interest_rate/annual_payments, years*annual_payments, principal), 2)
    # initialize the variables to keep track of the periods and running balances
    p = 1
    beg_balance = principal
    end_balance = principal

    while end_balance > 0:

        # Recalculate the interest based on the current balance
        interest = round(((interest_rate/annual_payments) * beg_balance), 2)

        # Determine payment based on whether or not this period will pay off the loan
        pmt = min(pmt, beg_balance + interest)
        principal = pmt - interest

        # Ensure additional payment gets adjusted if the loan is being paid off
        addl_principal = min(addl_principal, beg_balance - principal)
        end_balance = beg_balance - (principal + addl_principal)

        yield OrderedDict([('Month',start_date),
                           ('Period', p),
                           ('Begin Balance', beg_balance),
                           ('Payment', pmt),
                           ('Principal', principal),
                           ('Interest', interest),
                           ('Additional_Payment', addl_principal),
                           ('End Balance', end_balance)])

        # Increment the counter, balance and date
        p += 1
        start_date += relativedelta(months=1)
        beg_balance = end_balance

Once this function is defined, building out a dataframe containing the full schedule for the results is straightforward:

schedule = pd.DataFrame(amortize(700000, .04, 30, addl_principal=200, start_date=date(2016, 1,1)))
schedule.head()

	Period	Month	Begin Balance	Payment	Interest	Principal	Additional_Payment	End Balance
0	1	2016-01-01	700000.00	3341.91	2333.33	1008.58	200.0	698791.42
1	2	2016-02-01	698791.42	3341.91	2329.30	1012.61	200.0	697578.81
2	3	2016-03-01	697578.81	3341.91	2325.26	1016.65	200.0	696362.16
3	4	2016-04-01	696362.16	3341.91	2321.21	1020.70	200.0	695141.46
4	5	2016-05-01	695141.46	3341.91	2317.14	1024.77	200.0	693916.69

schedule.tail()

	Period	Month	Begin Balance	Payment	Interest	Principal	Additional_Payment	End Balance
319	320	2042-08-01	14413.65	3341.91	48.05	3293.86	200.0	10919.79
320	321	2042-09-01	10919.79	3341.91	36.40	3305.51	200.0	7414.28
321	322	2042-10-01	7414.28	3341.91	24.71	3317.20	200.0	3897.08
322	323	2042-11-01	3897.08	3341.91	12.99	3328.92	200.0	368.16
323	324	2042-12-01	368.16	369.39	1.23	368.16	0.0	0.00

The nice aspect of this solution is that the generator approach builds up the results in an incremental manner so that you do not have to try to determine how many iterations you need in advance. Essentially, the code keeps calculating the end_balance each period until it gets to 0 and the generator is complete.

Example Analysis

I have built out a variation on this solution that also includes summary statistics on the scenarios so that you can easily see things like:

How many payments will you make?
When will the balance be paid off?
How much in interest do you pay over the life of the loan?

This notebook contains the full working code. Here are a few example to show you how it works and can be a handy solution for modeling various scenarios:

schedule1, stats1 = amortization_table(100000, .04, 30, addl_principal=50, start_date=date(2016,1,1))
schedule2, stats2 = amortization_table(100000, .05, 30, addl_principal=200, start_date=date(2016,1,1))
schedule3, stats3 = amortization_table(100000, .04, 15, addl_principal=0, start_date=date(2016,1,1))

pd.DataFrame([stats1, stats2, stats3])

	Payoff Date	Num Payments	Interest Rate	Years	Principal	Payment	Additional Payment	Total Interest
0	2041-01-01	301	0.04	30	100000	477.42	50	58441.08
1	2032-09-01	201	0.05	30	100000	536.82	200	47708.38
2	2030-12-01	180	0.04	15	100000	739.69	0	33143.79

You could also build out some simple scenarios and visualize the alternative outcomes:

additional_payments = [0, 50, 200, 500]
fig, ax = plt.subplots(1, 1)

for pmt in additional_payments:
    result, _ = amortization_table(100000, .04, 30, addl_principal=pmt, start_date=date(2016,1,1))
    ax.plot(result['Month'], result['End Balance'], label='Addl Payment = ${}'.format(str(pmt)))
plt.title("Pay Off Timelines")
plt.ylabel("Balance")
ax.legend();

Lessons Learned

I will admit that it is embarrassing to put out a “solution” to a problem and realize pretty quickly (due to feedback) that it was wrong. In the interest of continuous improvement, here are some lessons I learned:

Understand the problem

I made the mistake of thinking I knew how the pre-payment process worked but I was obviously mistaken. If I spent a little more time building up a prototype in Excel and validating the results ahead of time, I would have caught my errors much earlier in the process.
Don’t fixate on a predefined solution approach

I decided that I wanted to make the solution in pure-pandas without any looping. In reality, I should have thought about the entire problem and all of the options available in the python ecosystem - including the standard lib.
Look at the standard lib

While pandas has a lot of great tools, the python standard library is really rich and provides many capabilities that can solve a wide variety of problems.
The python community is great

So many places on the Internet can be hostile. However, I am very impressed with how many people publicly and privately offered their support to help me fix the issue. Everyone that reached out to me was doing it in the spirit of trying to help me understand the problem and build a better solution. I appreciate their patience and willingness to work with me on finding a better approach. Several people spent much of their own time looking at my proposed fix and offering their ideas on how to improve.
Sometimes the best way to learn is to try and fail

I went into this article series trying to learn more about using pandas. However, I actually had a chance to learn and use generators for a real life problem. As a result I do understand python generators much more and understand why they are a good solution to this type of problem. I also spent some time pondering how to use python’s min and max functions to simplify some of my code.

Even with a bit of a stumble in this process, it has been a good learning experience and I hope it will be for many of you as well.

Practical Business Python

Building a Financial Model with Pandas - Version 2

Introduction

What was the problem?

What is the solution?

Example Analysis

Lessons Learned

Comments

Subscribe to the mailing list

Social

Submit a Topic

Popular

Article Roadmap

Feeds

Disclosure