1.4. Central tendency of data and skewness

Central Tendency of Data

Measures of central tendency are essential for exploratory data analysis. Each one indicates the center of the data distribution, but they are computed differently and can yield different values for the same dataset. The three main measures are the mean, the median, and the mode:

The mean, also called the average, is the sum of all observations divided by the number of observations (n).
The median is the midpoint of a dataset ordered from smallest to largest. When n is odd, it is the value in position (n+1)/2. When n is even, it is the average of the values in positions n/2 and n/2 + 1 (i.e., the two values in the middle).
The mode is the most frequently appearing data point. It is a useful measure when working with categorical variables.
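
As a minimal sketch of the three definitions above, the following Python snippet computes each measure with the standard library's statistics module. The dataset `values` and the helper `median_by_position` are hypothetical, introduced only for illustration. The helper spells out the positional rule for the median so it can be compared with the library result.

```python
import statistics

# Hypothetical dataset, chosen only to illustrate the definitions.
values = [2, 3, 3, 5, 7, 10]


def median_by_position(data):
    """Median using the positional rule (positions counted from 1)."""
    ordered = sorted(data)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the value in position (n + 1) / 2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the values in positions n / 2 and n / 2 + 1.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


mean_value = statistics.mean(values)      # sum of observations divided by n
median_value = statistics.median(values)  # library median, same rule as above
mode_value = statistics.mode(values)      # most frequently appearing value

print("mean:", mean_value)
print("median:", median_value, "| by position:", median_by_position(values))
print("mode:", mode_value)
```

Here the mean (5) is larger than the median (4) because the single large value 10 pulls the mean upward, while the median depends only on the middle positions.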

Comparing these three measures helps to identify three typical shapes of a frequency curve (symmetrical, positively skewed, and negatively skewed), as illustrated in the next figure.

What is Skewness?

Skewness measures how asymmetrical a distribution is, that is, whether its values extend further into one tail than the other.

In a symmetrical distribution, the mean and the median coincide. If the data distribution isn’t symmetrical, it is skewed. There are two types of skewness:

Positive skewness (right-skewed) is when the right tail is longer, most values are clustered on the left side of the distribution, and the median is typically smaller than the mean.
Negative skewness (left-skewed) is when the left tail is longer, most values are clustered on the right side of the distribution, and the median is typically greater than the mean.
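
The two cases above can be checked numerically. The sketch below is one possible illustration, not a prescribed method: it uses the moment-based skewness coefficient (the third central moment divided by the second central moment raised to the power 1.5), which is only one of several skewness definitions. The function name `sample_skewness` and the three datasets are hypothetical.

```python
import statistics


def sample_skewness(data):
    """Moment-based skewness: m3 / m2 ** 1.5 (assumes the data are not all equal)."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical datasets, chosen only to illustrate the three shapes.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]    # long right tail
left_skewed = [-x for x in right_skewed]    # mirror image: long left tail
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed),
                   ("left", left_skewed),
                   ("symmetric", symmetric)]:
    print(name,
          "| skewness:", round(sample_skewness(data), 3),
          "| mean:", statistics.fmean(data),
          "| median:", statistics.median(data))
```

In these examples the sign of the coefficient matches the direction of the longer tail, and the mean sits on the same side of the median as that tail. In the symmetric example, the mean and the median coincide.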
