Binning Data with Pandas qcut and cut
Posted by Chris Moffitt in articles
When dealing with continuous numeric data, it is often helpful to bin the data into
multiple buckets for further analysis. There are several different terms for binning
including bucketing, discrete binning, discretization or quantization. Pandas supports
these approaches using the cut and qcut functions.
This article will briefly describe why you may want to bin your data and how to use the pandas
functions to convert continuous data to a set of discrete buckets. Like many pandas functions,
cut and qcut may seem simple, but there is a lot of capability packed into
those functions. Even for more experienced users, I think you will learn a couple of tricks
that will be useful for your own analysis.
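
As a quick preview of the difference between the two functions, here is a minimal sketch. The Series values are made up for illustration and are not data from this article; the variable names are likewise just placeholders.

```python
import pandas as pd

# Hypothetical sample data: nine small values and one large outlier
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

# cut splits the *range* of the data into equal-width intervals,
# so the outlier stretches the bins and most values share the first one
equal_width = pd.cut(values, bins=3)

# qcut splits the data at quantiles, so each bin receives
# (roughly) the same number of observations
equal_count = pd.qcut(values, q=2)

print(equal_width.value_counts().sort_index())
print(equal_count.value_counts().sort_index())
```

With this sample, cut places nine of the ten values in the lowest interval, while qcut divides them five and five. That contrast, equal-width bins versus equal-sized groups, is the main thing to keep in mind when choosing between the two.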