This guest post that walks through a great example of using python to automate a report generating process. I think PB Python readers will enjoy learning from this real world example using python, jupyter notebooks, papermill and several other tools.
Several years ago, I wrote an article about using pandas to creating a diff of two excel files. Ovet the years, the pandas API has changed and the diff script no longer works with the latest pandas releases. Through the magic of search engines, people are still discovering the article and are asking for help in getting it to work with more recent versions of pandas. Since pandas is closing in on a 1.0 release, I think this is a good time to get an updated version out there.
Python has many options for natively creating common Microsoft Office file types including Excel, Word and PowerPoint. In some cases, however, it may be too difficult to use the pure python approach to solve a problem. Fortunately, python has the “Python for Windows Extensions” package known as pywin32 that allows us to easily access Window’s Component Object Model (COM) and control Microsoft applications via python. This article will cover some basic use cases for this type of automation and how to get up and running with some useful scripts.
Whenever I am doing analysis with pandas my first goal is to get data into
a panda’s DataFrame using one of the many available options. For the vast
majority of instances, I use
However, there are instances when I just have a few lines of data or some calculations
that I want to include in my analysis. In these cases it is helpful to
know how to create DataFrames from standard python data structures such as lists or
dictionaries. The basic process is not difficult but because there are several different
options it is helpful to understand how each works. I can never remember whether
I should use
from_items or the
DataFrame constructor. Normally, through some trial and error,
I figure it out. Since it is still confusing to me, I thought I would walk through
several examples below to clarify the different approaches. At the end of the article,
I briefly show how this can be useful when generating Excel reports.