AIMD GPDS Courses
  • Home
  • Courses
  • Contact


3.1  Arithmetic Operations

3.2  Handling Missing Data

3.3  Discretization

❯  3.4  Statistics

3.5  Filtering

⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺
EXPECTED COMPLETION TIME
❲▹❳  Video   4m 7s
☷  Interactive readings   5m

Computing Aggregate Statistics

Here, we generate some normally distributed random data and compute some aggregate statistics:
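A minimal sketch of this step, assuming NumPy is used for the random data. The seed, the array shape, and the variable names are illustrative choices, not taken from the lesson:

```python
import numpy as np

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
arr = rng.standard_normal((5, 4))  # 5 rows, 4 columns of N(0, 1) samples

total = arr.sum()           # sum over every element
average = arr.mean()        # method form
average_fn = np.mean(arr)   # equivalent top-level function
spread = arr.std()          # population standard deviation (ddof=0)

print(total, average, spread)
```

Both the array method (`arr.mean()`) and the top-level function (`np.mean(arr)`) return the same value; which one to use is a matter of style.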

Computing Statistics on Rows & Columns

Functions like mean and sum take an optional axis argument that computes the statistic over the given axis, resulting in an array with one fewer dimension.
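As an illustration of the axis argument, the sketch below uses a small hand-built array (an assumption made so the results are easy to check by eye):

```python
import numpy as np

# 3 rows x 4 columns: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
arr = np.arange(12, dtype=float).reshape(3, 4)

col_means = arr.mean(axis=0)  # collapses the rows: one mean per column, shape (4,)
row_sums = arr.sum(axis=1)    # collapses the columns: one sum per row, shape (3,)

print(col_means)  # [4. 5. 6. 7.]
print(row_sums)   # [ 6. 22. 38.]
```

In both cases the 2-D input yields a 1-D result, which is the "one fewer dimension" behaviour described above.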

Performing Non-Aggregate Statistics

Other methods like cumsum and cumprod do not aggregate, instead producing an array of the intermediate results:
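A short sketch of the non-aggregating methods, using small integer arrays chosen for illustration so the intermediate results can be verified by hand:

```python
import numpy as np

flat = np.arange(1, 6)            # [1, 2, 3, 4, 5]
running_total = flat.cumsum()     # [1, 3, 6, 10, 15]

arr = np.array([[0, 1, 2],
                [3, 4, 5],
                [6, 7, 8]])
down_rows = arr.cumsum(axis=0)     # running sum down each column
across_cols = arr.cumprod(axis=1)  # running product along each row

print(running_total)
print(down_rows)
print(across_cols)
```

The output of `cumsum` and `cumprod` has the same shape as the input, since every intermediate result is kept rather than collapsed into one value.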

Descriptive and Summary Statistics

count        Number of non-NA values
describe     Compute a set of summary statistics for a Series or each DataFrame column
min, max     Minimum and maximum values
cumsum       Cumulative sum of elements, starting from 0
cumprod      Cumulative product of elements, starting from 1
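The methods listed above can be tried on a small pandas DataFrame. This is a sketch only; the column names and values are made up for illustration, with one missing value included to show how count behaves:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0],  # contains one missing value
    "b": [2.0, 3.0, 4.0, 5.0],
})

non_na = df.count()            # per-column count of non-NA values
lowest = df.min()              # NaN is skipped by default
highest = df.max()
running_sum = df["b"].cumsum()
running_prod = df["b"].cumprod()

print(non_na)
print(lowest, highest)
print(running_sum.tolist(), running_prod.tolist())
```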

Extras

Calculating all Statistics using describe

Instead of computing each statistic separately, you can call the describe method to get all the essential summary statistics at once.
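A minimal example of describe on a numeric DataFrame; the data and column names are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [10.0, 20.0, 30.0, 40.0],
})

# One row per statistic, one column per numeric column of df.
summary = df.describe()
print(summary)
```

For numeric columns the result is indexed by count, mean, std, min, the 25%, 50%, and 75% percentiles, and max, so a single value can be read with, for example, `summary.loc["mean", "x"]`.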

Calculation Speed: Custom Formula vs Pandas Methods

You can always compute the mean or standard deviation by hand with your own formula. Here we compare a custom formula against the built-in pandas methods to see which computes both statistics faster.
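One way such a comparison could be set up is sketched below. The helper names (`manual_mean`, `manual_std`), the data size, and the repeat count are assumptions for illustration; actual timings depend on the machine and library versions, so none are shown here:

```python
import math
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
s = pd.Series(rng.standard_normal(10_000))
values = s.tolist()  # plain Python list for the hand-written formulas

def manual_mean(vals):
    """Arithmetic mean computed with built-in sum and len."""
    return sum(vals) / len(vals)

def manual_std(vals):
    """Sample standard deviation (n - 1 denominator, matching Series.std)."""
    m = manual_mean(vals)
    return math.sqrt(sum((v - m) ** 2 for v in vals) / (len(vals) - 1))

# Time 20 repetitions of each approach.
t_manual = timeit.timeit(lambda: (manual_mean(values), manual_std(values)), number=20)
t_pandas = timeit.timeit(lambda: (s.mean(), s.std()), number=20)

print(f"custom formula: {t_manual:.4f} s")
print(f"pandas methods: {t_pandas:.4f} s")
```

Note that `Series.std` uses the sample formula (`ddof=1`) by default, so the custom formula divides by n - 1 to make the two results comparable.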


©2023. All rights reserved.  Samy Baladram,
Graduate Program in Data Science - GSIS - Tohoku University