Several months ago, I participated in my first crowd-sourced Data Science competition in the Twin Cities run by Analyze This!. In my previous post, I described the benefits of working through the competition and how much I enjoyed the process. I just completed the second challenge and had another great experience that I wanted to share and (hopefully) encourage others to try these types of practical challenges to build their Data Science/Analytics skills.
In this second challenge, I felt much more comfortable with the actual process of cleaning the data, exploring it and building and testing models. I found that the python tools continue to serve me well. However, I also identified a lot of things that I need to do better in future challenges or projects in order to be more systematic about my process. I am curious if the broader community has tips or tricks they can share related to some of the items I will cover below. I will also highlight a few of the useful python tools I used throughout the process. This post does not include any code but is focused more on the process and python tools for Data Science.