This is the second article in a series describing how to use Google Forms to collect information via simple web forms, read the responses into a pandas DataFrame and analyze them. This article focuses on using the data in the DataFrame to create complex and powerful data visualizations with seaborn.
Google Forms is a service that allows you to collect information via simple web forms. One of its useful features is that the forms automatically save your data to a Google Sheet. This article will walk through how to create a form, authenticate using OAuth 2 and read all the responses into a pandas DataFrame. Because the initial setup and authentication process is a little time-consuming, this article will be the first in a two-part series.
In case you missed it, GitHub recently announced that Jupyter notebooks will be rendered natively on the site. This useful new feature will make it easier for followers of pbpython to view notebooks on GitHub, as well as download them to a local system and follow along.
I have moved four notebooks over to GitHub and set up the associated files so that it should be pretty straightforward for anyone to check out the pbpython repo and work with the notebooks. This will also make it easier for others to follow along, help spot issues and make this collection of tips and tricks even more robust.
This post also contains a couple of helpful links that I wanted to pass on and keep a record of.
Pandas makes it very easy to output a DataFrame to Excel. However, there are limited options for customizing the output and using Excel's features to make your output as useful as it could be. Fortunately, it is easy to use the excellent XlsxWriter module to customize and enhance the Excel workbooks created by pandas' to_excel function. This article will describe how to use XlsxWriter and pandas to make complex, visually appealing and useful Excel workbooks. As an added bonus, the article will briefly discuss the use of the new assign function introduced in pandas 0.16.0.
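As a quick taste of the two pieces mentioned above, here is a minimal sketch (the column names and numbers are made up for illustration) of using assign to add a computed column, with the XlsxWriter output step shown but commented out since it needs the xlsxwriter package installed:

```python
import pandas as pd

# Hypothetical sales data for illustration
df = pd.DataFrame({"quota": [100, 200, 300], "sales": [120, 180, 330]})

# assign returns a new DataFrame with the computed column added;
# the lambda form lets the new column reference the frame being built
df = df.assign(pct_of_quota=lambda d: d["sales"] / d["quota"])

# To write the result through XlsxWriter (requires the xlsxwriter package):
# df.to_excel("sales_report.xlsx", engine="xlsxwriter", sheet_name="report")
print(df)
```

Because assign returns a new DataFrame, it chains naturally with other pandas methods before the final to_excel call.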
Several years ago, I developed a very simple program called barnum to generate fake data that could be used to test applications. Over the years, I had forgotten about it. With the recent closing of Google Code, I decided to take the opportunity to move the code to GitHub and see if it might be useful to people.
Pandas is excellent at manipulating large amounts of data and summarizing it in multiple text and visual representations. Without much effort, pandas supports output to CSV, Excel, HTML, JSON and more. Where things get more difficult is if you want to combine multiple pieces of data into one document. For example, if you want to put two DataFrames on one Excel sheet, you need to use the Excel libraries to manually construct your output. It is certainly possible but not simple. This article will describe one method to combine multiple pieces of information into an HTML template and convert it to a standalone PDF document using Jinja templates and WeasyPrint.
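The core of the approach can be sketched in a few lines: render a DataFrame to an HTML table, drop it into a Jinja template, then hand the resulting HTML to WeasyPrint. The template, titles and data below are made-up placeholders, and the PDF step is commented out since it needs WeasyPrint and its system dependencies installed:

```python
import pandas as pd
from jinja2 import Template  # requires the jinja2 package

# A hypothetical DataFrame to place in the report
sales = pd.DataFrame({"region": ["North", "South"], "total": [125, 200]})

# A minimal inline template; a real report would load a full HTML file
template = Template("""
<h1>{{ title }}</h1>
{{ table }}
""")

# DataFrame.to_html renders the frame as an HTML table for the template
html = template.render(title="Quarterly Report",
                       table=sales.to_html(index=False))

# WeasyPrint would then turn the HTML string into a standalone PDF, roughly:
# from weasyprint import HTML
# HTML(string=html).write_pdf("report.pdf")
print(html)
```

The nice part of this split is that multiple DataFrames, images and text blocks can all be rendered into one template before the single PDF conversion step.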
The previous pivot table article described how to use the pandas pivot_table function to combine and present data in an easy-to-view manner. This concept is probably familiar to anyone who has used pivot tables in Excel. However, pandas can also easily take a cross section of the data and manipulate it. This cross-section capability makes a pandas pivot table really useful for generating custom reports. This article will give a short example of how to manipulate the data in a pivot table to create a custom Excel report with a subset of pivot table data.
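To illustrate the cross-section idea, here is a minimal sketch (with made-up manager and product names) that builds a small pivot table and then uses xs to pull out all the rows for one manager:

```python
import pandas as pd

# Hypothetical sales data for illustration
df = pd.DataFrame({
    "Manager": ["Debra", "Debra", "Fred", "Fred"],
    "Product": ["CPU", "Software", "CPU", "Software"],
    "Price": [30000, 10000, 35000, 5000],
})

# Summarize price by manager and product into a MultiIndex pivot table
table = pd.pivot_table(df, index=["Manager", "Product"],
                       values="Price", aggfunc="sum")

# xs takes a cross section: every row for a single value of one index level
debra = table.xs("Debra", level="Manager")
print(debra)
```

In a reporting loop, you could call xs once per manager and write each cross section to its own Excel sheet, which is the essence of the custom-report pattern.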
In the Python world, there are multiple options for visualizing your data. Because of this variety, it can be really challenging to figure out which one to use when. This article contains a sample of some of the more popular ones and illustrates how to use them to create a simple bar chart. I will create examples of plotting data with: pandas, Seaborn, ggplot, Bokeh, pygal and Plotly.
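For a sense of the baseline, here is a minimal pandas/matplotlib version of such a bar chart (the category names and numbers are invented); the other libraries offer similarly short one-call interfaces on top of their own styling:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import pandas as pd

# Hypothetical figures to plot
data = pd.Series([112, 64, 30],
                 index=["Transportation", "Education", "Parks"])

# pandas wraps matplotlib, so one call produces a labeled bar chart
ax = data.plot(kind="bar", title="Sample Budget")
ax.figure.savefig("bar_chart.png")
```

Each bar becomes a patch on the matplotlib Axes, which is also handy when you want to tweak colors or annotations after the fact.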
More and more information from local, state and federal governments is being placed on the web. However, a lot of the data is not presented in a way that is easy to download and manipulate. I think it is an important civic duty for us all to be aware of how government money is spent. Having the data in a more accessible format is a first step in that process.
In this article, I'll use BeautifulSoup to scrape some data from the Minnesota 2014 Capital Budget. Then I'll load the data into a pandas DataFrame and create a simple plot showing where the money is going.
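The scrape-then-load pattern can be sketched as follows. The HTML snippet and dollar amounts below are stand-ins, not the real budget page, and the plotting call is shown as a comment since the cleanup is the interesting part:

```python
import pandas as pd
from bs4 import BeautifulSoup  # requires the beautifulsoup4 package

# A stand-in snippet of HTML; the article scrapes the real budget page
html = """
<table>
  <tr><td>Transportation</td><td>$112,000,000</td></tr>
  <tr><td>Higher Education</td><td>$126,000,000</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = []
for tr in soup.find_all("tr"):
    cells = [td.get_text() for td in tr.find_all("td")]
    # Strip the $ and commas so the amount can be treated as a number
    rows.append({"category": cells[0],
                 "amount": int(cells[1].replace("$", "").replace(",", ""))})

df = pd.DataFrame(rows)
# df.plot(kind="bar", x="category", y="amount") would produce the simple plot
print(df)
```

Once the dollar strings are converted to integers, all of pandas' sorting and aggregation tools are available for the analysis.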