In early March, I published an article introducing prophet which is an open source library released by Facebook that is used to automate the time series forecasting process. As I promised in this article, I’m going to see how well those predictions held up to the real world after 2.5 months of traffic on this site.
A common business analytics task is trying to forecast the future based on known historical data. Forecasting is a complicated topic and relies on an analyst knowing the ins and outs of the domain as well as knowledge of relatively complex mathematical theories. Because the mathematical concepts can be complex, a lot of business forecasting approaches are “solved” with a little linear regression and “intuition.” More complex models would yield better results but are too difficult to implement.
Given that background, I was very interested to see that Facebook recently open sourced a python and R library called prophet which seeks to automate the forecasting process in a more sophisticated but easily tune-able model. In this article, I’ll introduce prophet and show how to use it to predict the volume of traffic in the next year for Practical Business Python. To make this a little more interesting, I will post the prediction through the end of March so we can take a look at how accurate the forecast is.
Despite being over 1 year old, one of the most popular articles I have written is Overview of Python Visualization Tools. After these many months, it is one of my most frequently searched for, linked to and read article on this site. I think this fact speaks to hunger in the python community for one visualization tool to rise above the rest. I am not sure I want (or need) one to “win” but I do continue to watch the changes in this space with interest.
All of the tools I mentioned in the original article are still alive and many have changed quite a bit over the past year or so. Anyone looking for a visualization tool should investigate the options and see which ones meet their needs. They all have something to offer and different use-cases will drive different solutions.
In the spirit of keeping up with the latest options in this space, I recently heard about Altair which calls itself a “declarative statistical visualization library for Python.” One of the things that peaked my interest was that it is developed by Brian Granger and Jake Vanderplas. Brian is a core developer in the IPython project and very active in the scientific python community. Jake is also active in the scientific python community and has written a soon to be released O’Reilly book called Python Data Science Handbook. Both of these individuals are extremely accomplished and knowledgeable about python and the various tools in the python scientific ecosystem. Because of their backgrounds, I was very curious to see how they approached this problem.
In the python world, there are multiple options for visualizing your data. Because of this variety, it can be really challenging to figure out which one to use when. This article contains a sample of some of the more popular ones and illustrates how to use them to create a simple bar chart. I will create examples of plotting data with: Pandas, Seaborn, ggplot, Bokeh, pygal and Plotly.
More and more information from local, state and federal governments is being placed on the web. However, a lot of the data is not presented in a way that is easy to download and manipulate. I think it is an important civic duty for us all to be aware of how government money is spent. Having the data in a more accessible format is a first step in that process.
In this article, I’ll use BeautifulSoup to scrape some data from the Minnesota 2014 Capital Budget. Then I’ll load the data into a pandas DataFrame and create a simple plot showing where the money is going.