AIMD GPDS Courses
3.3  Discretization


Performing Basic Binning

Suppose you have data about a group of people in a study, and you want to group them into discrete age buckets.

Let's divide these into bins of 18 to 25, 26 to 35, 36 to 60, and finally 61 and older. To do so, use cut, a function in pandas that takes the data and a list of bin edges.
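A minimal sketch of this binning follows. The ages list is hypothetical sample data, and the upper edge of 100 is an assumed cap for the oldest bucket; neither comes from the lesson itself.

```python
import pandas as pd

# Hypothetical sample ages, invented for illustration
ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]

# Bin edges. By default each interval is open on the left and closed on the
# right, so these edges give (18, 25], (25, 35], (35, 60], (60, 100].
# The upper edge of 100 is an assumption.
bins = [18, 25, 35, 60, 100]

# cut assigns each age to the interval it falls into
age_categories = pd.cut(ages, bins)
print(age_categories)
```

Passing right=False to cut closes the intervals on the left instead, which changes which bucket an age of exactly 25 lands in.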

Categorizing Bins

The object that cut returns is a special Categorical object. It contains a categories array specifying the distinct category names, along with a labeling for the ages data in the codes attribute.
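To make the two attributes concrete, here is a short sketch that reuses hypothetical ages and bin edges (both are assumptions for illustration) and prints categories and codes.

```python
import pandas as pd

# Hypothetical sample ages and bin edges, invented for illustration
ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]
bins = [18, 25, 35, 60, 100]

age_categories = pd.cut(ages, bins)

# The distinct intervals, in order
print(age_categories.categories)

# One integer per age: the position of its interval in categories
print(age_categories.codes)
```

Each entry in codes is an index into categories, so a code of 0 means the age fell into the first interval.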

We can also replace the interval labels with our own names using reset_index and set_index.
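One possible way to do this relabeling is sketched below; it is an illustration, not necessarily the exact steps used in the lesson. The group names, the sample ages, and the column names "interval", "count", and "group" are all hypothetical. The last lines show the labels parameter of cut, which assigns names directly at binning time.

```python
import pandas as pd

# Hypothetical sample ages, bin edges, and group names
ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]
bins = [18, 25, 35, 60, 100]
group_names = ["Youth", "YoungAdult", "MiddleAged", "Senior"]

age_categories = pd.cut(ages, bins)

# Count the ages per interval, ordered by interval
counts = pd.Series(age_categories).value_counts().sort_index()

# reset_index moves the intervals into an ordinary column
table = counts.reset_index()
table.columns = ["interval", "count"]

# Attach the new names and make them the index with set_index
table["group"] = group_names
relabeled = table.set_index("group")["count"]
print(relabeled)

# Alternative: let cut assign the names directly
named = pd.cut(ages, bins, labels=group_names)
print(named)
```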

Making bins with Equal Range

If you pass an integer number of bins to cut instead of explicit bin edges, it will compute equal-length bins based on the minimum and maximum values in the data. 
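As a small sketch of equal-length binning, the example below cuts 20 uniformly distributed values into four bins. The data, the seed, and the precision=2 setting (which only limits the decimal places shown in the labels) are choices made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 20 values drawn uniformly from [0, 1)
rng = np.random.default_rng(0)
data = rng.uniform(size=20)

# An integer asks cut to build that many equal-length bins spanning
# the minimum and maximum of the data
equal_width = pd.cut(data, 4, precision=2)

print(equal_width.categories)
print(pd.Series(equal_width).value_counts().sort_index())
```

The bins have equal length, but the number of values in each bin depends on how the data is distributed.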

Binning using Quartile

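A closely related function, qcut, bins the data based on sample quantiles. Depending on the distribution of the data, cut will usually not put the same number of data points in each bin. Because qcut uses sample quantiles instead, each bin holds roughly the same number of data points.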
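The following sketch illustrates the difference on hypothetical normally distributed data (the data and seed are assumptions): qcut with 4 splits the values into quartiles, and the counts per bin come out roughly equal.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1000 values from a standard normal distribution
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

# 4 quantile-based bins, i.e. quartiles
quartiles = pd.qcut(data, 4)

# Each bin should hold about a quarter of the values
counts = pd.Series(quartiles).value_counts().sort_index()
print(counts)
```

Unlike the equal-length bins from cut, these intervals have different widths: they are narrow where the data is dense and wide in the tails.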

Just as you can pass explicit bin edges to cut, you can pass your own quantiles to qcut as numbers between 0 and 1, inclusive.
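A brief sketch with custom quantiles follows. The data, the seed, and the particular cut points (10%, 50%, 90%) are chosen for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1000 values from a standard normal distribution
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

# Custom quantiles: bottom 10%, next 40%, next 40%, top 10%
custom = pd.qcut(data, [0, 0.1, 0.5, 0.9, 1.0])

counts = pd.Series(custom).value_counts().sort_index()
print(counts)
```

The share of values in each bin follows the gaps between consecutive quantiles, so the two outer bins are much smaller than the two inner ones.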

©2023. All rights reserved.  Samy Baladram,
Graduate Program in Data Science - GSIS - Tohoku University