Competency: Plans data analysis using statistics and hypothesis testing and presents written research methodology.
The Four Scales of Measurement
Data can be classified as being on one of four scales: nominal, ordinal, interval, or ratio. Each level of measurement has some important properties that are useful to know. For example, only the ratio scale has meaningful zeros.
Nominal Scale. Nominal variables (also called categorical variables) can be placed into categories. They don’t have a numeric value and cannot be added, subtracted, divided, or multiplied. They also have no order; if they appear to have an order, you probably have ordinal variables instead.
Ordinal Scale. The ordinal scale contains things that you can place in order. For example, hottest to coldest, lightest to heaviest, richest to poorest. If you can rank data by 1st, 2nd, 3rd place (and so on), you have data on an ordinal scale.
Interval Scale. An interval scale has ordered numbers with meaningful divisions. Temperature is on the interval scale: a difference of 10 degrees between 90 and 100 means the same as 10 degrees between 150 and 160. Compare that to high school ranking (which is ordinal), where the difference between 1st and 2nd might be .01 and between 10th and 11th .5. If you have meaningful divisions, you have something on the interval scale.
Ratio Scale. The ratio scale is the same as the interval scale with one major difference: zero is meaningful. For example, a height of zero is meaningful (it means you do not exist). Compare that to a temperature of zero, which exists but does not mean anything in particular (although admittedly, on the Celsius scale, it is the freezing point of water).
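One way to see the difference between the interval and ratio scales is to check whether a ratio survives a change of units. The sketch below is only an illustration; the function names and the sample values (10 and 20 degrees, 90 and 180 cm) are assumptions chosen for the example.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32


def cm_to_inches(cm):
    """Convert a length in centimetres to inches."""
    return cm / 2.54


# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
temp_ratio_c = 20 / 10
temp_ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio scale: zero means "none", so ratios are the same in any unit.
height_ratio_cm = 180 / 90
height_ratio_in = cm_to_inches(180) / cm_to_inches(90)

print(temp_ratio_c, temp_ratio_f)        # the two ratios disagree
print(height_ratio_cm, height_ratio_in)  # the two ratios agree
```

Saying that 20 degrees is "twice as hot" as 10 degrees therefore depends on the unit chosen, while saying that one person is twice as tall as another does not. Differences, by contrast, are meaningful on both scales.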
What is Descriptive Statistics?
Descriptive statistics is one of the fundamental “must-knows” for any set of data. It gives you a general idea of trends in your data, including:
· The mean, mode, median, and range.
· Variance and standard deviation.
· Skewness.
· Count, maximum, and minimum.
Descriptive statistics is useful because it allows you to take a large amount of data and summarize it. For example, let us say you had data on the incomes of one million people. No one will want to read a million pieces of data; if they did, they would not be able to glean any useful information. On the other hand, if you summarize it, it becomes useful: an average wage, or a median income, is much easier to understand than reams of data.
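The idea of boiling a long list of incomes down to a few numbers can be sketched in Python. The income figures and the helper name summarize are hypothetical, chosen only to illustrate the point; the standard-library statistics module supplies mean and median.

```python
from statistics import mean, median

# Hypothetical yearly incomes; one very high earner is included on purpose.
incomes = [32000, 35000, 38000, 41000, 45000, 52000, 250000]


def summarize(values):
    """Return a few common descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "mean": mean(values),
        "median": median(values),
    }


summary = summarize(incomes)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these made-up figures, the single large income pulls the mean well above the median, which is one reason summaries often report both.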
Sub-Areas
Descriptive statistics can be further broken down into several sub-areas, like:
· Measures of central tendency.
· Measures of dispersion.
· Charts & graphs.
· Shapes of Distributions.
Difference between Descriptive and Inferential Statistics
Statistics can be broken down into two areas:
· Descriptive statistics: describes and summarizes data. You are just describing what the data shows: a trend, a specific feature, or a certain statistic (like a mean or median).
· Inferential statistics: uses statistics to make predictions.
Descriptive statistics just describes data. For example, descriptive statistics about a college could include the average SAT score for incoming freshmen, the median income of parents, and the racial makeup of the student body. It says nothing about why the data might exist, or what trends you might be able to see from the data. When you take your data and start to make predictions about future behavior or trends, that is inferential statistics. Inferential statistics also allows you to take sample data (e.g., from one university) and apply it to a larger population (e.g., all universities in the country).
What is Inferential Statistics?
Descriptive statistics describe data (for example, a chart or graph), and inferential statistics allows you to make predictions (“inferences”) from that data. With inferential statistics, you take data from samples and make generalizations about a population. For example, you might stand in a mall and ask a sample of 100 people if they like shopping at Sears. You could make a bar chart of yes or no answers (that would be descriptive statistics), or you could use your research (and inferential statistics) to reason that around 75-80% of the population (all shoppers in all malls) like shopping at Sears.
There are two main areas of inferential statistics:
Estimating parameters. This means taking a statistic from your sample data (for example, the sample mean) and using it to say something about a population parameter (for example, the population mean).
Hypothesis tests. This is where you can use sample data to answer research questions. For example, you might be interested in knowing if a new cancer drug is effective, or if breakfast helps children perform better in school.
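Estimating a parameter can be sketched with the mall survey. The count of 78 "yes" answers out of 100 is a made-up figure, the function name is hypothetical, and the sketch assumes the 100 people are a random sample. It uses the common normal-approximation interval for a proportion, which is only one of several ways to build such an estimate.

```python
from math import sqrt

# Hypothetical survey result: 78 of 100 shoppers said "yes".
yes_answers = 78
sample_size = 100


def proportion_interval(successes, n, z=1.96):
    """Approximate confidence interval for a population proportion.

    Uses the normal approximation; z=1.96 corresponds to 95% confidence.
    """
    p_hat = successes / n                       # sample statistic
    margin = z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - margin, p_hat + margin


low, high = proportion_interval(yes_answers, sample_size)
print(f"Estimated share of all shoppers: {low:.2f} to {high:.2f}")
```

The sample proportion (the statistic) is 0.78, and the interval of roughly 0.70 to 0.86 is the inference about the population proportion (the parameter).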
Let’s say you have some sample data about a potential new cancer drug. You could use descriptive statistics to describe your sample, including:
· Sample mean
· Sample standard deviation
· Making a bar chart or boxplot
· Describing the shape of the sample probability distribution
Quantitative Data Analysis
By now, you already know that data means facts or information about people, places, things, events, and so on. When these data appear not in words, images, or pictures but in numerical forms such as fractions, numbers, and percentages, they become quantitative data. To understand the numbers standing for the information, you need to analyze them; that is, you have to examine or study them, not by taking the data as a whole, but by separating it into its components. Then, examine each part or element to see the relationships between or among the parts, to discover the orderly or sequential existence of these parts, to search for meaningful patterns of the components, and to know the reasons behind the formation of such variable patterns.
Quantitative data analysis is time-consuming because it involves a series of examinations, classifications, mathematical calculations, and graphical recording. Hence, thorough and advance planning is needed for this major aspect of your study. However, all these varied analytical activities that you pour into your research become significant only if, before settling on them, you have already identified the measurement level or scale of your quantitative data; that is, whether your study measures the data through a ratio or interval scale rather than a nominal or ordinal scale, because these last two levels of measurement are for qualitative data analysis. It is important for you to know what scale of measurement to use, for the kind of quantitative analysis you will do depends on your measurement scale (De Mey 2013; Letherby 2013; Russel 2013).
Steps in Quantitative Data Analysis
Having identified the measurement scale or level of your data means you are now ready to analyze the data in this manner (Badke 2012; Letherby 2013; Mc Bride 2013):
Step 1: Preparing the Data
Keep in mind that no data organization means no sound data analysis. Hence, prepare the data for analysis by first doing these two preparatory substeps:
1. Coding System
To analyze data means to quantify or change the verbally expressed data into numerical information. Once the words, images, or pictures are converted into numbers, they become fit for any analytical procedure requiring arithmetic and mathematical computation. You cannot, however, perform division, multiplication, or subtraction on words unless you code the verbal responses and observation categories.
For instance, for the gender variable, give number 1 as the code or value for Male and number 2 for Female. For educational attainment as another variable, give the value of 2 for elementary, 4 for high school, 6 for college, 9 for MA, and 12 for Ph.D. level. By coding each item with a certain number in a data set, you are able to add the points or values of the respondents’ answers to a particular interview question or questionnaire item.
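A coding system of this kind can be sketched as a pair of lookup tables. The codes are the ones given above; the respondents and the function name encode are hypothetical.

```python
# Code books taken from the text: each category label maps to a number.
GENDER_CODES = {"Male": 1, "Female": 2}
EDUCATION_CODES = {
    "elementary": 2,
    "high school": 4,
    "college": 6,
    "MA": 9,
    "Ph.D.": 12,
}

# Hypothetical respondents, recorded as (gender, educational attainment).
responses = [
    ("Female", "college"),
    ("Male", "high school"),
    ("Female", "Ph.D."),
]


def encode(response):
    """Replace the verbal answers of one respondent with their numeric codes."""
    gender, education = response
    return GENDER_CODES[gender], EDUCATION_CODES[education]


coded = [encode(r) for r in responses]
print(coded)  # [(2, 6), (1, 4), (2, 12)]
```

A dictionary lookup raises a KeyError for any answer that has no code, which helps catch misspelled or unexpected responses early.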
2. Data Tabulation
For easy classification and distribution of numbers based on a certain criterion, you have to collate them with the help of a graphic organizer called a table. Used for frequency and percentage distribution, a table is an excellent data organizer that researchers find indispensable. Here’s an example of tabulated data:
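A frequency and percentage table of this kind can be sketched in Python. The responses below are hypothetical and are not data from this module; the function name frequency_table is likewise an assumption.

```python
from collections import Counter

# Hypothetical answers to a single questionnaire item.
answers = ["Agree", "Disagree", "Agree", "Neutral", "Agree", "Disagree"]


def frequency_table(values):
    """Return (category, frequency, percentage) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [
        (category, count, round(100 * count / total, 1))
        for category, count in counts.most_common()
    ]


rows = frequency_table(answers)
print(f"{'Response':<10}{'Frequency':>10}{'Percent':>10}")
for category, count, percent in rows:
    print(f"{category:<10}{count:>10}{percent:>10}")
```

Each row pairs a response category with how often it occurs and what share of all responses it represents.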
Step 2: Analyzing the Data
Data coding and tabulation are the two important things you have to do in preparing the data for analysis. Before immersing yourself in studying every component of the data, decide on the kind of quantitative analysis you have to use: simple descriptive statistical techniques or advanced analytical methods. The first one, which college students often use, tells some aspects of categories of data such as frequency distribution, measures of central tendency (mean, median, and mode), and standard deviation. However, it does not give information about the population from which the sample came. The second one fits graduate-level research studies because it involves complex statistical analysis requiring a good foundation in and thorough knowledge of statistics. The following paragraphs give further explanations about the two quantitative data-analysis techniques (De Mey 2013; Litchtman 2013; Picardie 2014).
1. Descriptive Statistical Technique
This quantitative data-analysis technique summarizes the orderly or sequential data obtained from the sample through the data-gathering instrument used. The analysis results reveal the following aspects of an item in a set of data (Morgan 2014; Punch 2014; Walsh 2010):
* Frequency Distribution – gives you the frequency and percentage of the occurrence of an item in a set of data. In other words, it gives you the number of times a response is given for one question.
* Measure of Central Tendency – indicates the different positions or values of the items, such that in a category of data, you find an item or items serving as the:
Mean – average of all the items or scores
Example: 3 + 8 + 9 + 2 + 3 + 10 + 3 = 38
38 ÷ 7 = 5.43 (Mean)
Median – the score in the middle of the set of items, once they are arranged in order, that cuts or divides the set into two groups
Example: Arranged in order (2, 3, 3, 3, 8, 9, 10), the numbers in the example for the Mean have three (3) as the Median.
Mode – refers to the item or score in the data set that has the most repeated appearance in the set.
Example: Again, in the given example above for the Mean, 3 is the Mode.
* Standard Deviation – shows the extent of the difference of the data from the mean. An examination of this gap between the mean and the data gives you an idea about the extent of the similarities and differences between the respondents. There are mathematical operations that you have to do to determine the standard deviation. Here they are:
Step 1. Compute the Mean
Step 2. Compute the deviation (difference) between each respondent’s answer (data item) and the mean. The plus sign (+) appears before the number if the data item is higher than the mean; the negative sign (−), if it is lower.
Step 3. Compute the square of each deviation.
Step 4. Compute the sum of squares by adding the squared figures.
Step 5. Divide the sum of squares by the number of data items to get the variance.
Step 6. Compute the square root of the variance to get the standard deviation.
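The measures of central tendency and the six standard-deviation steps can be traced in a short Python sketch using the seven scores from the Mean example. The variable names are illustrative assumptions; median and mode come from the standard-library statistics module.

```python
from math import sqrt
from statistics import median, mode

scores = [3, 8, 9, 2, 3, 10, 3]  # the data set used in the Mean example

# Measures of central tendency.
mean_score = sum(scores) / len(scores)  # Step 1: 38 / 7
median_score = median(scores)           # middle value after sorting
mode_score = mode(scores)               # most frequent value

# Standard deviation, following Steps 2 to 6.
deviations = [x - mean_score for x in scores]  # Step 2
squared = [d ** 2 for d in deviations]         # Step 3
sum_of_squares = sum(squared)                  # Step 4
variance = sum_of_squares / len(scores)        # Step 5
standard_deviation = sqrt(variance)            # Step 6

print(round(mean_score, 2), median_score, mode_score)    # 5.43 3 3
print(round(variance, 2), round(standard_deviation, 2))  # 9.96 3.16
```

Dividing by the number of data items in Step 5 gives what is usually called the population variance, which matches statistics.pstdev in Python. When the data are a sample used to estimate a population, many tools divide by one less than the number of items instead, as statistics.stdev does.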
2. Advanced Quantitative Analytical Methods
An analysis of quantitative data that involves more complex statistical methods, which need computer software such as SPSS, Stata, or Minitab, among others, is common among graduate-level students taking their MA or Ph.D. degrees. Some of the advanced methods of quantitative data analysis are the following (Argyrous 2011; Levin & Fox 2014; Godwin 2014):
a. Correlation – uses statistical analysis to yield results that describe the relationship of two variables. The results, however, are incapable of establishing causal relationships.
b. T-Test – the results of this statistical analysis are used to determine if the difference in the means or averages of two categories of data is statistically significant.
Example: If the mean of the grades of students attending tutorial lessons is significantly different from the mean of the grades of students not attending tutorial lessons
c. Analysis of Variance (ANOVA) – the results of this statistical analysis are used to determine if the differences among the means or averages of three or more categories of data are statistically significant.
d. Regression – has some similarities with correlation in that it also shows the nature of the relationship between variables, but it gives more extensive results than correlation. Aside from indicating the presence of a relationship between two variables, it determines whether the treatment (independent variable) is capable of predicting the outcome (dependent variable) and how strong that relation is. Just like correlation, regression is incapable of establishing cause-effect relationships.
Example: If reviewing with music (treatment variable) is a statistically significant predictor of the extent of a person’s concept learning (outcome variable)
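The four methods above can be sketched with plain Python to show what each one computes. All data and function names below are hypothetical. The t statistic uses Welch's unequal-variance form, which is one common variant among several, and the regression is the simplest case of a straight line with one predictor.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Correlation: strength and direction of a linear relationship."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def welch_t(group_a, group_b):
    """T-test statistic for the difference between two group means."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def anova_f(*groups):
    """One-way ANOVA F statistic for three or more group means."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def least_squares(x, y):
    """Simple regression: slope and intercept of the best-fit line."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: hours of review and test scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

# Hypothetical grades for three groups of students.
tutored = [85, 88, 90, 92]
not_tutored = [78, 80, 83, 79]
self_study = [82, 84, 86, 80]

r = pearson_r(hours, scores)
t = welch_t(tutored, not_tutored)
f = anova_f(tutored, not_tutored, self_study)
slope, intercept = least_squares(hours, scores)

print(f"correlation r = {r:.3f}")
print(f"t statistic   = {t:.3f}")
print(f"F statistic   = {f:.3f}")
print(f"regression: score = {intercept:.1f} + {slope:.1f} * hours")
```

These functions return only the test statistics. Deciding whether a result is statistically significant also requires a p-value from the t or F distribution, which is where statistical software such as the packages named above comes in.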