Several years ago, I developed a very simple program called barnum to generate fake data that could be used to test applications. Over the years, I forgot about it. With the recent closing of Google Code, I decided to take the opportunity to move the code to GitHub and see if it might be useful to people.
Pandas is excellent at manipulating large amounts of data and summarizing it in multiple text and visual representations. Without much effort, pandas supports output to CSV, Excel, HTML, JSON and more. Things get more difficult when you want to combine multiple pieces of data into one document. For example, if you want to put two DataFrames on one Excel sheet, you need to use the Excel libraries to manually construct your output. It is certainly possible but not simple. This article will describe one method to combine multiple pieces of information into an HTML template and convert it to a standalone PDF document using Jinja templates and WeasyPrint.
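As a rough illustration of the idea, here is a minimal sketch that renders two DataFrames into a single HTML document with a Jinja template. The data, the column names and the template text are all invented for this example and are not taken from the full article.

```python
import pandas as pd
from jinja2 import Template

# Hypothetical data, made up purely for illustration.
summary = pd.DataFrame({"region": ["East", "West"], "total": [100, 250]})
detail = pd.DataFrame({
    "region": ["East", "East", "West"],
    "item": ["Widget", "Gadget", "Widget"],
    "amount": [40, 60, 250],
})

# A tiny inline template; a real report would more likely load a template file.
template = Template(
    "<html><body>"
    "<h1>{{ title }}</h1>"
    "{{ summary_table }}"
    "<h2>Detail</h2>"
    "{{ detail_table }}"
    "</body></html>"
)

# DataFrame.to_html() returns each table as an HTML string,
# which the template places into one document.
html = template.render(
    title="Sales Report",
    summary_table=summary.to_html(index=False),
    detail_table=detail.to_html(index=False),
)
```

If WeasyPrint is installed, the resulting string could then be converted to a PDF with something like `weasyprint.HTML(string=html).write_pdf("report.pdf")`, where the output file name is an assumption.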
The previous pivot table article described how to use the pandas pivot_table function to combine and present data in an easy-to-view manner. This concept is probably familiar to anyone who has used pivot tables in Excel. However, pandas also has the capability to easily take a cross section of the data and manipulate it. This cross section capability makes a pandas pivot table really useful for generating custom reports. This article will give a short example of how to manipulate the data in a pivot table to create a custom Excel report with a subset of pivot table data.
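To give a flavor of what a cross section looks like, here is a small sketch that builds a pivot table and then pulls out the rows for one manager with `xs`. The sales data and column names are hypothetical and only meant to show the mechanics.

```python
import pandas as pd

# Hypothetical sales data, invented for this example.
sales = pd.DataFrame({
    "Manager": ["Manager A", "Manager A", "Manager B", "Manager B"],
    "Rep": ["Rep 1", "Rep 2", "Rep 3", "Rep 4"],
    "Product": ["CPU", "CPU", "CPU", "Monitor"],
    "Quantity": [1, 2, 3, 1],
    "Price": [30000, 10000, 35000, 5000],
})

# Summarize by manager, rep and product.
table = pd.pivot_table(
    sales,
    index=["Manager", "Rep", "Product"],
    values=["Price", "Quantity"],
    aggfunc="sum",
)

# Take a cross section: only the rows for one manager.
# The "Manager" level is dropped from the resulting index.
manager_a = table.xs("Manager A", level="Manager")
```

The subset is an ordinary DataFrame, so it could be written out with `manager_a.to_excel(...)`, provided an Excel engine such as openpyxl is available.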
In the Python world, there are multiple options for visualizing your data. Because of this variety, it can be really challenging to figure out which one to use and when. This article contains a sample of some of the more popular ones and illustrates how to use them to create a simple bar chart. I will create examples of plotting data with pandas, Seaborn, ggplot, Bokeh, pygal and Plotly.
More and more information from local, state and federal governments is being placed on the web. However, a lot of the data is not presented in a way that is easy to download and manipulate. I think it is an important civic duty for us all to be aware of how government money is spent. Having the data in a more accessible format is a first step in that process.
In this article, I’ll use BeautifulSoup to scrape some data from the Minnesota 2014 Capital Budget. Then I’ll load the data into a pandas DataFrame and create a simple plot showing where the money is going.
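The general scrape-then-load pattern can be sketched as follows. The HTML snippet, agency names and amounts below are made up for illustration and do not come from the actual budget page, which would be downloaded rather than defined inline.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
PAGE = """
<table>
  <tr><th>Agency</th><th>Amount</th></tr>
  <tr><td>Agency A</td><td>$1,000</td></tr>
  <tr><td>Agency B</td><td>$2,500</td></tr>
</table>
"""

soup = BeautifulSoup(PAGE, "html.parser")

rows = []
for tr in soup.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:  # the header row only has <th> cells, so it is skipped
        rows.append(cells)

budget = pd.DataFrame(rows, columns=["Agency", "Amount"])

# Strip the currency formatting so the amounts can be treated as numbers.
budget["Amount"] = (
    budget["Amount"].str.replace("[$,]", "", regex=True).astype(int)
)
```

With matplotlib installed, `budget.plot(kind="bar", x="Agency", y="Amount")` would give a simple bar chart of where the money is going.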