Lecture 12
Introduction:
Measures of Central Tendency provides a comprehensive overview of the fundamental statistical tools used to describe the center of a set of data. This lecture covers the three main measures of central tendency, which are the mean, median, and mode, and explains how to calculate them and interpret their results. It also explores the advantages and limitations of each measure and offers guidance on selecting the appropriate measure for a given dataset.
In addition to covering the basic concepts, "Measures of Central Tendency" delves deeper into more advanced topics such as weighted means, trimmed means, and measures of central tendency for grouped data. This lecture also provides practical examples and real-world applications of central tendency measures in various fields, including finance, economics, psychology, and healthcare.
Whether you are a student, researcher, or practitioner in any field that involves statistical analysis, this lecture provides a valuable resource for understanding and utilizing measures of central tendency to accurately describe and interpret data. With clear explanations, numerous examples, and helpful illustrations, "Measures of Central Tendency" is an essential guide for anyone seeking to master this important statistical concept.
Lecture 12. Measures of Central Tendency
In the previous lesson, we learned the definition of descriptive statistics: brief informational coefficients that summarize a given data set, which can represent either an entire population or a sample of a population. Descriptive statistics are broken down into measures of central tendency, measures of variability (spread), and measures of relative position. Measures of central tendency include the mean, median, and mode. Measures of variability include the standard deviation, the variance, and the minimum and maximum values; kurtosis and skewness, which describe the shape of a distribution, are often listed with them. Measures of relative position include quartiles, deciles, and percentiles.
Descriptive statistics, in short, help describe and understand the features of a specific data set by giving short summaries about the sample and measures of the data.
People use descriptive statistics to repurpose hard-to-understand quantitative insights across a large data set into bite-sized descriptions.
A student’s grade point average (GPA), for example, is a good illustration of a descriptive statistic. The idea of a GPA is that it takes data points from a wide range of exams, classes, and grades, and averages them together to provide a general understanding of a student’s overall academic performance. A student’s GPA reflects their mean academic performance.
Types of Descriptive Statistics
Descriptive statistics fall into three groups: measures of central tendency, measures of variability (also known as measures of dispersion), and measures of relative position.
In this lesson, we will discuss the measures of central tendency for ungrouped and grouped data: their importance, how to compute each average, and when best to use each one.
MEASURES OF CENTRAL TENDENCY
Definition:
Measures of central tendency focus on the average or middle values of data sets. They are often presented alongside graphs, tables, and general discussion to help people understand the meaning of the analyzed data. Measures of central tendency describe the center position of a distribution for a data set. A person analyzes the frequency of each data point in the distribution and describes it using the mean, median, or mode, which capture the most common patterns of the analyzed data set.
A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data. As such, measures of central tendency are sometimes called measures of central location. They are also classed as summary statistics.
There are three main measures of central tendency: the mean, median, and mode. Each of these measures gives a different indication of the typical or central value in the distribution.
In the following sections, we will look at the mean, median, and mode, and learn how to calculate them and under what conditions each is most appropriate.
Measures of Central Tendency: MEAN ( x̄ ) [Ungrouped Data]
The mean (or average) is the most popular and well-known measure of central tendency. It can be used with both discrete and continuous data, although it is most often used with continuous data. The mean is equal to the sum of all the values in the data set divided by the number of values in the data set. So, if a data set has n values x_1, x_2, ..., x_n, the sample mean, denoted x̄, is:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n}$$
You may have noticed that the above formula refers to the sample mean. So, why have we called it a sample mean? This is because, in statistics, samples and populations have very different meanings, and these differences are very important, even if, in the case of the mean, they are calculated in the same way. To acknowledge that we are calculating the population mean and not the sample mean, we use the Greek lower-case letter mu, denoted as µ:
$$\mu = \frac{\sum x}{N}$$
where N is the number of values in the population.
Example:
Consider the wages of the 10 employees of TUNGAB Refreshment below, in thousands (k). Solve for the mean.
Using the formula of the mean, we have:
$$\bar{x} = \frac{\sum x}{n}$$
Substituting all the values of x, with n = 10, gives x̄ = 30.7k.
The computation itself is correct. However, the computed value of 30.7k does not seem to reflect the typical salary of a worker, since most workers have salaries ranging from 12k to 18k. The mean is being skewed by the two large salaries. Therefore, in this situation, we would like to have a better measure of central tendency.
REMEMBER:
The mean has one main disadvantage: it is particularly susceptible to the influence of outliers. These are values that are unusual compared to the rest of the data set by being especially small or large in numerical value. When there are significant outliers in your data set, the mean loses its ability to provide the best central location for the data, because the outliers drag it away from the typical value.
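The effect of outliers on the mean can be sketched in a few lines of Python. The wage list below is hypothetical: the original wage table is not reproduced here, and the values were chosen only so that the result agrees with the 30.7k quoted in the example.

```python
from statistics import mean

# Hypothetical wages in thousands (k); NOT the original table.
wages = [11, 12, 13, 14, 15, 16, 17, 18, 95, 96]

# Sample mean: sum of the values divided by the number of values.
x_bar = sum(wages) / len(wages)

# Mean of the same data with the two largest wages left out,
# to show how strongly those two values pull the mean upward.
typical = sorted(wages)[:-2]
x_bar_typical = mean(typical)

print(f"mean with the two large wages:    {x_bar:.1f}k")
print(f"mean without the two large wages: {x_bar_typical:.1f}k")
```

With these illustrative values, removing just two of the ten wages cuts the mean roughly in half, which is why the mean alone can misrepresent a typical salary.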
Measures of Central Tendency: MEDIAN ( x̃ ) [Ungrouped Data]
The median is the middle score for a set of data that has been arranged in order of magnitude. The median is less affected by outliers and skewed data.
Example:
Consider the wages of the 10 employees of TUNGAB Refreshment below, in thousands (k). Solve for the median.
Using the formula of the median, we have:
We first need to rearrange that data into order of magnitude (smallest first). Then we have:
After arranging the data, locate the middle score to find the median. (Remember: if the number of data values is even, locate the two middle scores and take their average.)
Since n = 10 (an even number), we locate the two middle scores in our data set, the 5th and 6th values, and average them. This gives a median of 15.5k.
Comparing the two averages, we have mean = 30.7k and median = 15.5k. The median describes the typical salary of a worker more accurately, since most workers have salaries ranging from 12k to 18k. Remember that when outliers are significant in the data set, the median may be the best measure of central tendency.
FACT: The median (or mode) may or may not be affected by the outliers in the data set.
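The procedure above (sort, then take the middle value or the average of the two middle values) can be sketched as follows. The wage list is hypothetical and was chosen only to agree with the 15.5k quoted in the example; the helper name `median_by_hand` is ours.

```python
from statistics import median

# Hypothetical wages in thousands (k), deliberately unsorted.
wages = [95, 13, 16, 11, 18, 14, 96, 12, 17, 15]

def median_by_hand(values):
    """Median of ungrouped data: sort, then pick the middle."""
    ordered = sorted(values)          # arrange smallest first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: one middle score
        return ordered[mid]
    # even n: average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_by_hand(wages))   # 15.5
print(median(wages))           # the standard library gives the same value
```

Note that the two large wages (95 and 96) could be replaced by any larger numbers without changing the result, which is the sense in which the median resists outliers.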
Measures of Central Tendency: MODE (X̂) [Ungrouped Data]
The mode is the most frequent score in our data set. On a histogram or bar chart, it corresponds to the highest bar. You can, therefore, sometimes consider the mode as being the most popular option.
An example of a mode is presented below:
Normally, the mode is used for categorical data where we wish to know which is the most common category, as illustrated below:
We can see above that the most common form of transport, in this particular data set, is the bus.
However, one problem with the mode is that it is not always unique, which causes difficulty when two or more values share the highest frequency, as in the example below:
FACT:
Remember that the mode (or median) may or may not be affected by outliers. However, the mode may not be the best measure of central tendency, since a data set may have two or more modes or no mode at all.
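As a minimal sketch of these two situations, the snippet below finds the mode of categorical data and of a data set with two modes. The survey responses and the numeric list are hypothetical, and the helper name `modes` is ours.

```python
from collections import Counter
from statistics import multimode

def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

# Hypothetical survey of how people travel to work (categorical data).
transport = ["bus", "car", "bus", "train", "bus", "car", "walk"]
print(modes(transport))    # a single mode

# Hypothetical scores where two values tie for the highest frequency.
tied = [2, 3, 3, 5, 7, 7, 9]
print(modes(tied))         # two modes (bimodal)
print(multimode(tied))     # the standard library returns the same list
```

Returning a list rather than a single value makes the "not unique" problem explicit: the caller has to decide what to do when more than one mode comes back.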
Example:
Consider the wages of the 10 employees of TUNGAB Refreshment below, in thousands (k). Solve for the mode.
To find the mode, we look for the score that occurs most often.
We first arrange the data in order of magnitude (smallest first), so that repeated values sit next to each other. Then we have:
Looking at the data above, no score appears more than once. Therefore, there is no mode at all.
(Note: you can’t put a “0” to represent “no mode”, since “0” is itself a value.)
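The "no mode" case can be sketched as below, following this lecture's convention that a data set in which no value repeats has no mode. The wage list is hypothetical and the helper name `mode_or_none` is ours; the function returns `None` rather than 0, in line with the note above.

```python
from collections import Counter

def mode_or_none(values):
    """Return the list of modes, or None when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value occurs exactly once: report "no mode",
        # not 0, because 0 could be a real data value.
        return None
    return [value for value, count in counts.items() if count == top]

# Hypothetical wages in thousands (k); all ten values are distinct.
wages = [11, 12, 13, 14, 15, 16, 17, 18, 95, 96]
print(mode_or_none(wages))          # None
print(mode_or_none([1, 2, 2, 3]))   # [2]
```

Be aware that library functions may follow a different convention: in Python 3.8 and later, `statistics.mode` returns the first value it encounters when every value occurs once, so it cannot be used to detect "no mode".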
EXERCISE NO. 01. BINGO PROBLEM
Given the BINGO cards below, solve for the mean, median, and mode of each card:
BINGO CARD NO. 1
BINGO CARD NO. 2
BINGO CARD NO. 3
EXERCISE NO. 02 WORD PROBLEM
1. Find the mean of the first 10 positive odd integers.
2. What is the median of the following data set?
32, 6, 21, 10, 8, 11, 12, 36, 17, 16, 15, 18, 40, 24, 21, 23, 24, 24, 29, 16, 32, 31, 10, 30, 35, 32, 18, 39, 12, 20
3. Identify the mode for the following data set:
21, 19, 62, 21, 66, 28, 66, 48, 79, 59, 28, 62, 63, 63, 48, 66, 59, 66, 94, 79, 19, 94
Measures of Central Tendency: MEAN ( x̄ ) [Grouped Data]
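To solve for the mean of grouped data, we use:
$$\bar{x} = \frac{\sum fm}{\sum f}$$
where f is the frequency of each class and m is the midpoint of that class.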
Example:
The following table gives the frequency distribution of the number of orders received each day during the past 50 days at the office of a mail-order company. Calculate the mean.
Number of orders    Frequency (f)
10-12               4
13-15               12
16-18               20
19-21               14
Total               50
Steps in Solving the mean involving grouped data
First: Find the midpoint m of each class in the given data.
To find the midpoint, add the lower and upper class limits and divide the sum by 2.
Example: (10 + 12)/2 = 22/2 = 11
so, we have:
Second: Once you have the midpoint of each class, multiply it by the corresponding frequency to get fm. Then find the sum of the values in the frequency (f) column and the sum of the values in the fm (frequency * midpoint) column.
For example: 4 * 11 = 44
So we have:
Third: Solve for the mean by substituting the sums obtained in the second step into the formula.
So we have,
$$\bar{x} = \frac{\sum fm}{\sum f} = \frac{832}{50} = 16.64$$
So, the average number of orders received each day is approximately 17.
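The three steps above can be sketched in Python as follows. The class frequencies are an assumption reconstructed from values quoted in this lecture: 4 and 20 are stated directly, while 12 and 14 are inferred from the cumulative frequencies 16 and 36 and the total of 50 days.

```python
# Each class as (lower limit, upper limit, frequency f).
# Frequencies 12 and 14 are inferred, not quoted directly.
classes = [(10, 12, 4), (13, 15, 12), (16, 18, 20), (19, 21, 14)]

rows = []
for lower, upper, f in classes:
    m = (lower + upper) / 2           # First: class midpoint
    rows.append((lower, upper, f, m, f * m))   # Second: fm

# Second (continued): column totals
sum_f = sum(f for _, _, f, _, _ in rows)
sum_fm = sum(fm for *_, fm in rows)

# Third: substitute the totals into mean = sum(fm) / sum(f)
grouped_mean = sum_fm / sum_f

for lower, upper, f, m, fm in rows:
    print(f"{lower}-{upper}: f={f}, m={m}, fm={fm}")
print(f"sum f = {sum_f}, sum fm = {sum_fm}, mean = {grouped_mean:.2f}")
```

Because every observation in a class is treated as if it sat at the class midpoint, the result is an approximation of the mean of the raw data, not its exact value.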
Measures of Central Tendency: MEDIAN ( x̃ ) [Grouped Data]
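To solve for the median of grouped data, we use:
$$\tilde{x} = L_m + \left(\frac{\frac{n}{2} - {<}CF}{f}\right) i$$
where L_m is the lower boundary of the median class, n is the total number of observations, <CF is the cumulative frequency before the median class, f is the frequency of the median class, and i is the class interval (width).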
Example:
The following table gives the frequency distribution of the number of orders received each day during the past 50 days at the office of a mail-order company. Calculate the median.
Steps in calculating for the median involving grouped data:
a]. Add a cumulative frequency column to your table. The cumulative frequency of a class is its frequency plus the sum of the frequencies of all the classes before it. The last value always equals the total number of observations, since by then every frequency has been added.
b]. Compute n/2, the position of the median in the ordered data.
c]. Locate the median class: the first class whose cumulative frequency is greater than or equal to n/2. In our case, n/2 = 50/2 = 25, and the first cumulative frequency that reaches 25 is 36, so the median class is in the third row (16-18).
Note: To find the median class, we need the cumulative frequencies of all the classes and the value of n/2. The median class is the class whose cumulative frequency is the first to reach or exceed n/2.
d]. After locating the median class, add another column to your table for the lower class boundary, Lm. Here the lower boundary is the lower class limit minus 0.5, so for the class 16-18 it is 15.5.
e]. Once the table is complete, read off the following values from the row of the median class:
Median class
<CF (cumulative frequency before the median class)
Lower Limit (Lm)
Frequency (f)
Class interval (i)
In our case,
The median class is the class 16-18, since n/2 = 50/2 = 25 and this is the first class whose cumulative frequency (36) reaches 25.
The <CF is the cumulative frequency of the class before the median class. The cumulative frequency of the median class is 36, and the value before it is 16. So, <CF = 16.
The lower boundary of the median class, Lm, is 15.5.
The frequency (f) of the median class is 20.
The class interval (i) is 3. This is the width of each class measured between class boundaries: 18.5 - 15.5 = 3 (equivalently, the difference between consecutive lower limits, 13 - 10 = 3). As a rough check, dividing the range of the class limits by the number of classes gives (21 - 10)/4 = 2.75, which rounds to 3.
Once you have these values, substitute them into the formula and carry out the operations:
$$\tilde{x} = 15.5 + \left(\frac{25 - 16}{20}\right)(3) = 15.5 + 1.35 = 16.85$$
So, the median of the given data set is approximately 17.
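Steps a] to e] can be sketched as a small Python function. This is an illustration under stated assumptions, not a general-purpose routine: it assumes integer class limits with a gap of 1 between classes (so the lower boundary is the lower limit minus 0.5), and it uses the same reconstructed frequencies (4, 12, 20, 14) inferred from the values quoted in this lecture.

```python
from itertools import accumulate

# Each class as (lower limit, upper limit, frequency f).
classes = [(10, 12, 4), (13, 15, 12), (16, 18, 20), (19, 21, 14)]

def grouped_median(classes):
    freqs = [f for _, _, f in classes]
    cum = list(accumulate(freqs))      # a] cumulative frequencies
    n = cum[-1]                        # last value = total observations
    half = n / 2                       # b] n/2
    # c] median class: first class whose cumulative frequency reaches n/2
    k = next(idx for idx, cf in enumerate(cum) if cf >= half)
    lower, upper, f = classes[k]
    cf_before = cum[k - 1] if k > 0 else 0   # <CF
    lm = lower - 0.5                   # d] lower boundary (assumed gap of 1)
    i = upper - lower + 1              # class interval (width)
    # e] substitute into Lm + ((n/2 - <CF) / f) * i
    return lm + (half - cf_before) / f * i

print(f"median = {grouped_median(classes):.2f}")
```

Computing the width as upper limit minus lower limit plus 1 gives 3 for every class here, which matches the class interval used in the worked example.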
Measures of Central Tendency: MODE (X̂) [Grouped Data]
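To solve for the mode of grouped data, we use the standard interpolation formula:
$$\hat{X} = L_{mo} + \left(\frac{d_1}{d_1 + d_2}\right) i$$
where L_mo is the lower boundary of the modal class, d_1 is the frequency of the modal class minus the frequency of the class before it, d_2 is the frequency of the modal class minus the frequency of the class after it, and i is the class interval (width).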
Example:
The following table gives the frequency distribution of the number of orders received each day during the past 50 days at the office of a mail-order company. Calculate the mode.
Steps in calculating for the mode involving grouped data:
To solve for the mode of grouped data, first locate the class with the highest frequency, since the mode is defined as the most frequent score in the data set.
In our example, the class 16-18 (number of orders) has the highest frequency, f = 20. Therefore, the modal class is the third row of the table.
So we have,
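Once you have located the modal class, identify the following values (refer to the modal class in finding them): the lower boundary of the modal class (15.5), the frequency of the modal class (20), the frequencies of the classes immediately before and after it (12 and 14), and the class interval (3).
Once you have these values, substitute them into the formula and carry out the operations.
So, the mode for the given data set is approximately equal to 17.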
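As a hedged sketch of this calculation, the function below assumes the usual interpolation formula for the grouped mode, lower boundary + (d1 / (d1 + d2)) * width, since the lecture's own formula is not reproduced here. The names `d1` and `d2` are labels introduced for this sketch, the frequencies 12 and 14 are inferred from the cumulative frequencies quoted earlier, and integer class limits with a gap of 1 are assumed.

```python
# Each class as (lower limit, upper limit, frequency f).
classes = [(10, 12, 4), (13, 15, 12), (16, 18, 20), (19, 21, 14)]

def grouped_mode(classes):
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))        # modal class: highest frequency
    lower, upper, f_modal = classes[k]
    # Neighbouring frequencies; treated as 0 at either end of the table.
    f_before = freqs[k - 1] if k > 0 else 0
    f_after = freqs[k + 1] if k < len(freqs) - 1 else 0
    d1 = f_modal - f_before            # excess over the class before
    d2 = f_modal - f_after             # excess over the class after
    boundary = lower - 0.5             # lower boundary (assumed gap of 1)
    width = upper - lower + 1          # class interval
    return boundary + d1 / (d1 + d2) * width

print(f"mode = {grouped_mode(classes):.2f}")
```

With the values above, d1 = 20 - 12 = 8 and d2 = 20 - 14 = 6, so the result lies a little above 17, consistent with the rounded answer in the example. The sketch assumes a single clear peak; if several classes tie for the highest frequency, it only reports the first, and if d1 + d2 is zero the formula is undefined.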
EXERCISE NO. 03. WORD PROBLEM
1. Calculate the median score of the students from the following distribution.
2. If the mean of the given frequency distribution is 35, then find the missing frequency y. Also, calculate the median and mode for the distribution.