Measures of centrality are essential for exploratory data analysis. They all describe the center of a data distribution, but they can yield different results on the same data. The three most common measures are the mean, the median, and the mode:
The mean, also called the average, is the sum of all observations divided by the total number of observations (participants or cases), n.
The median is the midpoint of a dataset ordered from smallest to largest. When n is odd, it is the value in position (n+1)/2. When n is even, it is the average of the values in positions n/2 and n/2 + 1 (i.e., the two values in the middle).
The mode is the most frequently appearing data point. It is a useful measure when working with categorical variables.
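As a minimal sketch of these three definitions, the snippet below uses Python's standard `statistics` module on a small made-up sample. The `data` list and the `manual_median` helper are illustrative assumptions, not part of the original text; the helper only spells out the odd/even position rule described above.

```python
import statistics

# Hypothetical sample, already small enough to check by hand.
data = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(data)      # sum of observations / n = 30 / 6
median_value = statistics.median(data)  # n is even: average of the two middle values
mode_value = statistics.mode(data)      # most frequent value


def manual_median(values):
    """Median computed directly from the position rule (illustrative helper)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle value, 1-based position (n + 1) / 2.
        return ordered[mid]
    # Even n: average of the values in 1-based positions n/2 and n/2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(mean_value, median_value, mode_value)
print(manual_median(data) == median_value)
```

The mode also works for categorical data, for example `statistics.mode(["red", "blue", "red"])` returns `"red"`, whereas the mean and median are not defined for such values.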
Comparing these three measures helps to identify three typical shapes of a frequency curve, each with its own ordering of the central tendency measures, as illustrated in the next figure.
Skewness measures the asymmetry of a distribution and indicates which tail is longer, that is, on which side extreme values are more likely to fall.
In a symmetrical distribution, the mean and the median coincide. If the data distribution isn't symmetrical, it is skewed. There are two types of skewness:
Positive skewness (right asymmetry) is when the right tail is longer, most values are clustered on the left side of the distribution, and the median is typically smaller than the mean.
Negative skewness (left asymmetry) is when the left tail is longer, most values are clustered on the right side of the distribution, and the median is typically greater than the mean.
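To make the two cases concrete, here is a small sketch that computes skewness as the third central moment divided by the second central moment raised to the power 1.5. This moment-based coefficient is one common definition and is an assumption here, since the text does not say which formula it has in mind. The function name `moment_skewness` and both samples are made up for illustration.

```python
import statistics


def moment_skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (one common definition, assumed here)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical samples: one large value stretches the right tail,
# and mirroring the data stretches the left tail instead.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
left_skewed = [-x for x in right_skewed]

for name, sample in [("right", right_skewed), ("left", left_skewed)]:
    print(
        name,
        "skewness:", moment_skewness(sample),
        "mean:", statistics.mean(sample),
        "median:", statistics.median(sample),
    )
```

In the right-skewed sample the coefficient is positive and the mean sits above the median; in the mirrored sample the sign flips and the mean sits below the median.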