In Module 01, we learned how to import a record sheet into jamovi.
However, this record sheet contains only raw data (e.g., raw scores for each item within a scale, as they appear from the responses in the questionnaire).
We often need to pre-process the raw data (e.g., compute a "happiness score" based on the items under the "happiness scale") before we can perform any statistical analysis (e.g., finding out whether females are happier than males).
This module introduces some fundamental techniques in pre-processing data using jamovi.
Recoding a variable allows you to convert an existing variable (in jamovi, the $source variable) into a different form based on specific criteria, such as a linear function (addition, subtraction, multiplication, or division). As long as you keep the recode key in your codebook, a recode is reversible.
One popular use of recoding is reverse-coding. Researchers often include reverse-coded items when developing a scale. In the 5-point Likert scale of Basic Self-Control, a higher score on a regular item indicates a higher level of self-control, while a higher score on a reverse-coded item indicates a lower level of self-control.
As a result, the reverse-coded items need to be recoded so that all items point in the same direction.
A common use of the recode function is therefore to handle reverse-coded items in a scale, for example, item 2 (BSC_2) in the Basic Self-Control (BSC) Scale.
We will need to recode it before further data processing and analyses.
Q: How do we recode a SINGLE reverse-coded item?
A: We use the “Transform” function in jamovi.
[Note: At 00:21-00:30 of the video below, there is a "reverse coded formula: max. score range + 1 - raw score". This works only for items with a minimum score of 1. The general formula for correcting a reverse-coded item is: corrected score = maximum score + minimum score - raw score. E.g., for an item that ranges from 0 to 10 (minimum score = 0, maximum score = 10), the formula for correcting the reverse-coded item is: 10 + 0 - raw score.]
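The general formula in the note can also be checked outside jamovi. The following is a minimal Python sketch, not jamovi syntax; the function name reverse_code and the sample scores are made up for illustration.

```python
def reverse_code(raw, minimum, maximum):
    """Correct a reverse-coded item: maximum score + minimum score - raw score."""
    if not minimum <= raw <= maximum:
        raise ValueError(f"raw score {raw} is outside {minimum}-{maximum}")
    return maximum + minimum - raw


# A 5-point item scored 1-5 (hypothetical responses)
five_point = [reverse_code(score, 1, 5) for score in [1, 2, 3, 4, 5]]

# An item scored 0-10, as in the example in the note
eleven_point = [reverse_code(score, 0, 10) for score in [0, 3, 10]]

print(five_point)    # [5, 4, 3, 2, 1]
print(eleven_point)  # [10, 7, 0]
```

Applying the same formula twice returns the original score, which is why a recode of this kind is reversible.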
There may be more than one reverse-coded item in a scale, for example, items 2, 3, 4 and 5 in the BSC scale.
Instead of recoding them one by one, we may want to recode them all at once.
Q: How do we recode MULTIPLE reverse-coded items?
A: We use the “Transform” function - multiple transform in jamovi.
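To see what recoding several items at once amounts to, here is a small Python sketch (an illustration only, not what jamovi runs internally). It assumes the items are scored 1 to 5 and that the columns are named BSC_1 to BSC_5; the two participants are invented.

```python
# Items 2-5 are the reverse-coded ones mentioned in the text
REVERSED_ITEMS = ["BSC_2", "BSC_3", "BSC_4", "BSC_5"]
MINIMUM, MAXIMUM = 1, 5  # assumed scoring range of the 5-point scale


def recode_reversed(row):
    """Return a copy of one participant's row with the reversed items corrected."""
    corrected = dict(row)  # keep the raw data untouched
    for item in REVERSED_ITEMS:
        corrected[item] = MAXIMUM + MINIMUM - row[item]
    return corrected


# Hypothetical raw responses
participants = [
    {"BSC_1": 4, "BSC_2": 1, "BSC_3": 2, "BSC_4": 5, "BSC_5": 3},
    {"BSC_1": 2, "BSC_2": 5, "BSC_3": 4, "BSC_4": 1, "BSC_5": 3},
]

recoded = [recode_reversed(p) for p in participants]
print(recoded[0])  # {'BSC_1': 4, 'BSC_2': 5, 'BSC_3': 4, 'BSC_4': 1, 'BSC_5': 3}
```

The same rule is applied to every listed item, and BSC_1 is left as it is because it is not reverse-coded.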
If you want to construct a new variable based on the values of some existing variable(s), you can use the compute function.
For example, in a course, we often have multiple assessment items, such as mid-term test, group presentation, and final exam. If you want to obtain the sum score for each student by adding up the three items, you can then use the compute function.
A single item score may not be very indicative of the entire scale. Therefore, we may need to compute the total score or average score for data analyses.
For instance, we use the total score of the BSC scale to do analyses.
Q: How do we compute the total score of the BSC scale?
A: We use the “Compute” function in jamovi.
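The arithmetic behind a total or average score is simple, as the following Python sketch shows. It is only an illustration: it assumes, for brevity, a scale of five items named BSC_1 to BSC_5 whose reverse-coded items have already been corrected, and the scores are invented.

```python
ITEMS = ["BSC_1", "BSC_2", "BSC_3", "BSC_4", "BSC_5"]  # assumed item names


def total_score(row):
    """Sum of all item scores for one participant."""
    return sum(row[item] for item in ITEMS)


def mean_score(row):
    """Average item score for one participant."""
    return total_score(row) / len(ITEMS)


# Hypothetical participants, reverse-coded items already corrected
participants = [
    {"BSC_1": 4, "BSC_2": 5, "BSC_3": 4, "BSC_4": 1, "BSC_5": 3},
    {"BSC_1": 2, "BSC_2": 1, "BSC_3": 2, "BSC_4": 5, "BSC_5": 3},
]

totals = [total_score(p) for p in participants]
means = [mean_score(p) for p in participants]
print(totals)  # [17, 13]
```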
If you want to classify participants based on their scores on an existing variable ($source), you can use the transform function. Unlike a recode, a transform is not guaranteed to be reversible. By using transform, you can sort values into different groups based on certain cutoff criteria (see examples below), or you can collapse several groups into a bigger group, depending on your interest.
For example, suppose you have a variable storing the age of all participants. The transform function allows you to classify participants into a nominal variable with values such as "Child (if age < 18)", "Adult (if 18 <= age <= 65)", and "Elderly (if age > 65)".
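The age cutoffs above can be written out as a short Python sketch. This is an illustration of the logic only, not jamovi's transform syntax; the function name age_group and the sample ages are made up.

```python
def age_group(age):
    """Classify an age using the cutoffs from the example."""
    if age < 18:
        return "Child"
    if age <= 65:  # reached only when age >= 18
        return "Adult"
    return "Elderly"


ages = [10, 18, 65, 66]  # hypothetical ages, including the boundary values
groups = [age_group(a) for a in ages]
print(groups)  # ['Child', 'Adult', 'Adult', 'Elderly']
```

Note that once ages are collapsed into groups, the exact age cannot be recovered from the group label, which is why a transform like this is not reversible.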
GPA can tell us how students perform in general, but at the end we still need to classify students into different honors.
We may want to split them into groups by GPA for some specific data analyses.
Q: How do we classify students into a new variable “Honor” based on the variable “GPA”?
A: We use the “Transform” function in jamovi.
If you want to select participants based on their scores in some variable(s), you can use the filter function. After such selection, jamovi will conduct analysis based only on the selected participants.
For example, if you want to select participants who are female, you can use the filter function.
Sometimes we may want to look at data which meet a specific condition only. For example, we want to look at data of female subjects only and perform analyses.
Q: How do we separate data of female subjects from male subjects?
A: We use the “Filter” function in jamovi.
!! IMPORTANT !!
In this example, the word "female" is written with quotation marks (i.e., " "). We use them whenever we want to tell jamovi that this is a value in a nominal scale. In a sense, these quotation marks tell jamovi that "this is not a number; it's a word". In more technical terms, this is called a "string" (as opposed to a number). In general, this rule of using quotation marks to indicate that something you type is a "string" instead of a number applies to all functions whenever you give commands to jamovi (e.g., in recode, compute, and transform, which you learned above, or in many other functions). [For your interest, in an even more general sense, this rule on quotation marks applies to many other computer software/systems/programming languages.]
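As an example of the same quotation-mark rule in a programming language, here is a minimal Python sketch of selecting female participants. It is not jamovi syntax; the column names Gender and Age and the three participants are assumptions made for illustration.

```python
# Hypothetical data set: one dictionary per participant
participants = [
    {"ID": 1, "Gender": "female", "Age": 21},
    {"ID": 2, "Gender": "male", "Age": 25},
    {"ID": 3, "Gender": "female", "Age": 30},
]

# "female" is in quotation marks because it is a string, not a number
females = [p for p in participants if p["Gender"] == "female"]

female_ids = [p["ID"] for p in females]
print(female_ids)  # [1, 3]
```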
Erroneous or nonsensical data may exist in the data set, especially after manual data entry. We want to find them and either correct them or filter them out before performing analyses.
Q: How do we find these erroneous or nonsensical data in the data set?
A: We use the “Filter” function in jamovi.
!! IMPORTANT !!
In this example, the number 100 is written WITHOUT quotation marks (i.e., it is Age < 100, instead of Age < "100"). This is because we want jamovi to treat 100 as a number and compare every Age score to this number. Because Age has been set as a Continuous variable, jamovi understands the scores under Age as numbers. Therefore, jamovi can only compare Age with other numbers. In short, if the variable you ask jamovi to compare is a Continuous (i.e., numerical) variable, you should only compare it with numbers (i.e., withOUT quotation marks) and never compare it with strings (i.e., WITH quotation marks on the numbers).
If you write Age < "100" when Age is set as a Continuous variable (e.g., for interval or ratio scales), jamovi will not be able to perform the filtering based on Age, because "100" is a string to jamovi, not a number (see the important notes under 4.1 above). Try it yourself!
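The difference between 100 and "100" is not specific to jamovi. The Python sketch below is an illustration only: the ages are invented, and Python's exact reaction (raising an error) is Python's own behavior, which may differ in detail from what jamovi does.

```python
ages = [23, 35, 230, 41]  # hypothetical ages; 230 is a data-entry error

# Comparing numbers with a number works as expected
valid_ages = [age for age in ages if age < 100]
print(valid_ages)  # [23, 35, 41]

# Comparing a number with the string "100" is rejected by Python
try:
    ages[0] < "100"
    mixed_comparison_works = True
except TypeError:
    mixed_comparison_works = False

print(mixed_comparison_works)  # False
```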
Now, if you think you're ready for the exercise, you can check your email for the link.
Remember to submit your answers before the deadline in order to earn the credits!