This page contains a collection of books and other content that I have found useful and hope you will as well. Any links that point to Amazon are affiliate links, which means this site will receive a small referral commission for any purchases made through them. The rest of the content is freely available and very useful.
The Python Data Science Handbook by Jake VanderPlas is an excellent overview of the core elements of the Python data science toolkit. This book is like 5 books in one, with great coverage of IPython, NumPy, pandas, matplotlib and machine learning with scikit-learn. I highly recommend this book to anyone who has basic Python experience and is planning to work with any of the tools it covers. All of the content has been generously made available as notebooks, so you can review it before purchasing the book.
Data Science for Business by Foster Provost and Tom Fawcett is a very useful book for thinking about how to apply data science to business problems. The book does not cover any specific language and is light on math, but it is very heavy on the fundamental concepts of data science and how to implement them in real life. This is a useful resource when it comes to figuring out how to apply technology in a complicated business setting.
Effective Pandas by Tom Augspurger is a short book that collects several of his blog posts. It does a fabulous job of describing idiomatic pandas code. This book is best for someone who has a basic understanding of Python and some exposure to pandas. I continually come back to the content to find new and more efficient ways to use pandas. All of the content is available on GitHub, but please consider purchasing the book if you find it useful.
A Whirlwind Tour of Python by Jake VanderPlas is a quick but insightful introduction to Python that is available for free. It focuses on basic and essential Python syntax and hopes “readers will walk away with a solid foundation from which to explore the data science stack”. If you have limited experience working with Python, this is a good place to get started.
I am a sucker for cheat sheets. I find them useful once I have the basic syntax of a package down and just need a quick refresher. Here are a few of the ones I refer back to frequently.
The official pandas cheat sheet is a nice summary of data wrangling functions in pandas. It does not cover everything pandas can do but it is a good reminder of the core concepts.
Over at the Mark Graph blog, there is a really detailed 12-page pandas DataFrame cheat sheet that is worth checking out. It goes into a lot more detail than the official pandas cheat sheet but is most useful to someone who has basic familiarity with pandas.
The Mark Graph blog also has a nice matplotlib cheat sheet that’s worth adding to your library.