Welcome to the first newsletter of 2021. I hope you are all doing well!
Around the site
In the past couple of months, I have written several new articles, including two new case studies. The main goal with these case studies is to highlight real-world Python usage and inspire you to apply these concepts to your own problems.
- Case Study: Processing Historical Weather Pattern Data describes how Michael Biermann developed several Jupyter notebooks to download and parse historical weather data. Even if you are not parsing German weather data, there are a lot of useful concepts in this case study that are broadly applicable.
- Reading Poorly Structured Excel Files in Pandas shows examples of how to parse specific sections of data out of an Excel file with data scattered across worksheets. Ideally, we would all like to have cleanly formatted data, but sometimes that’s not the case.
- Comprehensive Guide to Grouping and Aggregating with Pandas dives deeper into all of the options for grouping and aggregating your data in pandas. In my experience, I forget how much I can do using the pre-built and custom functions in pandas. Feel free to leave a comment if you have other tips that you use when grouping and aggregating data.
- Pandas DataFrame Visualization Tools is a survey of some of the options available for using graphical tools to explore DataFrames. I have received a lot of good feedback on this article and will likely make some updates in the future. If you have other tools you use, let me know and I will try to include them in future updates.
- Case Study: Automating Excel File Creation and Distribution with Pandas and Outlook is another good example of manipulating multiple files. This case study from Mark Doll has some overlap with Michael’s but also includes some examples of distributing results via Outlook.
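As a quick taste of the grouping and aggregating topic mentioned above, here is a minimal sketch that mixes pre-built aggregations with a custom function using pandas named aggregation. The DataFrame, the column names, and the `value_range` helper are all made up for illustration and are not taken from the article itself.

```python
import pandas as pd

# Hypothetical sales data, invented purely for this example.
df = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west", "west"],
        "units": [10, 20, 5, 15, 10],
        "price": [2.0, 3.0, 4.0, 1.0, 2.0],
    }
)


def value_range(series):
    """Custom aggregation: spread between the largest and smallest value."""
    return series.max() - series.min()


# Named aggregation: each keyword becomes an output column and maps to
# a (source column, aggregation) pair. Built-in aggregations are given
# as strings, the custom one as a plain function.
summary = df.groupby("region").agg(
    total_units=("units", "sum"),
    avg_price=("price", "mean"),
    units_range=("units", value_range),
)
print(summary)
```

Named aggregation keeps the output column names explicit, which tends to be easier to read than renaming columns after the fact.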
Other useful news
- Sebastian Raschka is an author and Assistant Professor of Statistics. He has made the content for his Intro to Machine Learning Course available. He also has an intro class covering NumPy and Matplotlib.
- A lot of Top 10 articles are click-baity, but this one actually had some interesting libraries I was not familiar with.
- I have written about Plotly Express before. This article is thorough and gives some really good examples for diving deeper into this library. I apologize for linking to Medium, but it is a good article.
- Jeff Hale has a nice summary of some of the newest features in scikit-learn.
- NumPy 1.20 just dropped with lots of updates.