Qualitative Coding
Breathe. You got this. It's more fun than you think.
Read:
Neale, J. (2016). Iterative categorization (IC): a systematic technique for analysing qualitative data. Addiction, 111(6), 1096-1106. https://doi.org/10.1111/add.13314
Watch:
Introduction to qualitative coding (12:03 min); https://www.youtube.com/playlist?list=PLBzv_METoGRJ5lZ8sTh4GFHAl4WfYRYYV
Try it!: Practice coding using real, authentic data from the Healthy Oregon Project (Alvord et al., 2020)
Quizlet: Learn Qualitative Coding - Level 1 https://quizlet.com/540790258/match
Quizlet: Learn Qualitative Coding - Level 2 https://quizlet.com/540877451/match
Comfort and growth rarely co-exist. Staring at a dataset that you have to code can be very uncomfortable and rather overwhelming in the beginning. You will second guess yourself frequently throughout this process about whether you are doing it correctly. That is a good sign! Trust that it will come together as you work through the process.
Read first. Read, and ideally re-read, all transcripts or field notes before you start coding. This gives you a better sense of what the data contain, so you are not making coding decisions blindly.
Transparency in your documentation is gold. Rigor and reproducibility are the goal when coding qualitative data. Be transparent in documenting your process so others can follow the same steps to get to the same results. You’ll also find that you may forget some of the details, so having documentation will help jog your memory if you come back to a project after a long time away.
Give it time. Coding is incredibly time-consuming. Do not leave it for the last minute.
Iteration improves results. Your first codebook may be pretty close to what you have at the end, but it is likely to change as you code your data. Save revisions and iterations, rather than overwriting old files.
Trust your gut. Often when coding, an excerpt doesn’t seem to fit easily into any existing options. This is often a sign that something needs to change, be better defined, or a new code may be needed.
These step-by-step procedures were developed by Dr. Adrienne Zell.
It sounds simple enough, but the first step is to read (and re-read) your documents.
For a small data set, try to read all of them.
For a larger project, select about a third.
When: You can start doing this prior to finishing data collection.
Who: All team members working with the data should read all the documents/samples (not just the ones they collected). If you're working with the data, get to know it first.
As you read, take notes (also called memos). Do this individually (you can compare your notes later). These notes help to draft an initial list of possible codes.
You may have some codes that are pre-existing (already identified as important before data collection), and some that will “emerge” from the data.
How many codes should I have?
For a small project, fewer than 20 (ideally closer to 10)
For a larger project, about 50 is manageable. The more you have, the harder it is to keep them separate in your head. Course faculty have found it easier to keep codes high-level and then dive deeper later (in the secondary analysis phase).
What is a code?
Something helpful to answering your research question
A theme that is pertinent to your research question
Something that is repeated in some or all of your documents
Something that stands out as worth investigating further
Something that shows relationships across the data
Something that can indicate a possible pattern across the data
A word, phrase, sentence, passage, paragraph...
Things, behaviors, feelings, people, concepts, opinions...
Come together, and merge your lists. Discuss each of the codes. Stay focused on your research question.
Discussion ideas: Is it pertinent to research question? Is it too narrow or too broad? Do you have any “Must-haves” and favorites? Use these as prompts to decide which codes you want to keep and which can be merged or discarded.
You can always add codes back in later or come up with new ones! You can also merge codes. However, you don’t want to get too far into coding and then make changes because you will have to re-code, which is time-consuming.
Tip: Schedule coding meetings with your team to discuss.
Virtual Tools
Use software that lets you “vote” on or sort codes online. Each person can do this individually once it is set up, and then the team can come together to review.
Google Jamboard works well, as do Google Forms or Docs
Kardsort
Polling software (e.g., Poll Everywhere, Slido)
Or go 'old school' with post-it notes or paper cards (example below)
The OHSU-PSU School of Public Health is a joint, professionally accredited school. In 2021, it had 12 degree programs, 1,600 students, and 98 faculty. The school's academic outcomes are reported for its degree programs and tracks to institutions (i.e., OHSU, PSU) and accreditors (i.e., CEPH, CAHME). Each agency had its own set of competencies and learning outcomes.
OHSU: 12 graduation competencies
PSU: 8 undergraduate; 9 graduate competencies
CEPH: 22 for MPH, 11 for BS/BA, MS, PhD
Learning outcomes for 12 degree programs (2 BS/BA, 1 MS, 6 MPHs, 3 PhDs; n = 90)
Annual assessment reports for these different metrics are due to the institutions at different times of year and could benefit from alignment across competencies. A group of program directors, faculty, staff, and students worked to align common themes needed for student training in public health across all reporting bodies. Six commonalities emerged from the morning workshop. The merged code list will be shared with the larger faculty for further discussion and potential adoption.
1) Invite key stakeholders
2) Print codes (i.e., each group’s competencies/outcomes) on colored strips of paper. Cut out and make sets.
3) Provide an easy legend and orient to activity.
Some like to work in groups from the start; others like starting alone. Either is fine. Once folks have the hang of it, start working together and look for agreement.
Some codes/groupings are very clear and easy to align. Others take more discussion.
The merged code list can then be shared with your larger team or stakeholders for further discussion.
This is everything. EVERYTHING.
Your codebook provides definitions and examples for the codes you use (e.g., the codes you put into software). The goal is to have a final codebook that you can use to train coders, share with stakeholders, and even publish.
It is simple:
Create a table with three columns: code, short definition, example(s) (i.e., representative quotes)
For small projects, you can use a single table. For larger projects, you may have multiple tables.
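As a minimal sketch, the three-column codebook table described above could be kept as a small CSV file that opens in Excel or any spreadsheet program. The codes, definitions, and quotes below are hypothetical placeholders, and the function name `codebook_to_csv` is an illustration only, not part of any coding platform.

```python
import csv
import io

# Hypothetical codes, definitions, and quotes, for illustration only.
codebook = [
    {
        "code": "Barriers",
        "definition": "Anything that makes participation harder.",
        "examples": "I couldn't get time off work.",
    },
    {
        "code": "Facilitators",
        "definition": "Anything that makes participation easier.",
        "examples": "The clinic was right by my bus stop.",
    },
]


def codebook_to_csv(rows):
    """Return the codebook as CSV text with three columns:
    code, definition, examples."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["code", "definition", "examples"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = codebook_to_csv(codebook)
print(csv_text)
```

Saving the text under a new file name for each revision, rather than overwriting, keeps the history of your codebook visible.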
When you really start coding, you can add new codes – but it’s tricky because you don’t want to spend a lot of time re-coding documents. These coding dictionaries are essential for reproducibility. Your coding meetings will help to define exactly what codes you'll be applying to your dataset.
How you code your data will depend on your research question. You have more flexibility than you think. The key is describing why you made that decision. Be transparent.
Examples:
Thematic analysis - most common; describes themes observed
Discourse analysis - the way people talk
Sentiment analysis - how people feel
Theory-guided - is there a theoretical framework informing the codes?
Practice-based frameworks
Universal Design for Learning (UDL) - Use the 9 guidelines to code your data
Rubric-associated coding - used a lot in education and assessment. Try one of the 16 VALUE rubrics, which are open source, freely available, and have a lot of testing behind them.
You can include “qualifier” codes: these are typically double coded (layered with another code).
Examples: Barriers, Facilitators, Negative, Positive, Recommendations, Great quotes
Pro tip: Consider adding a "Come back" code for excerpts you want to discuss later with the team; it makes them easy to find. This code doesn't make it into the final data, but it is very useful during the process.
Consider engaging the community in code development. How can you bring in “members” of your population or other key perspective holders?
How can you add other points of view? (e.g., advisory board, community groups, other researchers, colleagues, students, members of your population but from other sites/areas)
Review the research question, methods, and draft codes together. Be careful about confidentiality throughout, but especially when reviewing transcripts. Can they be de-identified?
How can your advisors be compensated for their time? Giving back financially, with your time, etc.
See examples of coding dictionaries cleaned up for publication (in manuscript appendices):
As you're coding, it can be helpful to know that you're not alone. Here are some sentiments from students as they learned to code qualitative data in a short activity in Winter 2022. These feelings are very common.
There are multiple ways of coding data.
It takes time to come to consensus. Creating coding dictionaries helps to get team members on the same page.
Assign a “primary” and a “secondary” coder to each document.
If you are the only coder, you will want to code a document, wait for a period of time, and then come back to it to review and add codes.
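For a team, assigning a primary and a secondary coder to each document can be done by simple rotation. The sketch below is one possible approach, not a prescribed procedure; the document names, coder labels, and the function `assign_coders` are hypothetical.

```python
def assign_coders(documents, coders):
    """Rotate through the coder list so each document gets a primary
    coder and a different secondary coder."""
    if len(coders) < 2:
        raise ValueError("Need at least two coders for primary and secondary roles.")
    assignments = {}
    for i, doc in enumerate(documents):
        primary = coders[i % len(coders)]
        secondary = coders[(i + 1) % len(coders)]
        assignments[doc] = {"primary": primary, "secondary": secondary}
    return assignments


# Hypothetical documents and coders, for illustration only.
documents = ["interview_01", "interview_02", "interview_03"]
coders = ["Coder A", "Coder B", "Coder C"]
assignments = assign_coders(documents, coders)

for doc, roles in assignments.items():
    print(doc, roles["primary"], roles["secondary"])
```

With as many documents as coders, this rotation gives each person exactly one document as primary coder and one as secondary coder.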
Figure out what platform you want to use based on the scope of your project. You can code in Word using the comment function and highlighting. But with longer transcripts or more complex projects, sometimes it's worthwhile to call in the bigger guns. Qualitative analysis software is easy to use and lets you focus on the code applications. In our class, we use Taguette for coding transcripts from interviews and focus groups.
Taguette - Freely available online and great for collaborative research. It is easy to use, doesn't have a lot of extra features, and has good support/documentation.
Getting started (Taguette guide)
Once you develop your codes, you can export to Excel or another program for secondary analysis of codes.
Other Coding Platforms
ATLAS.ti
NVivo 12
Dedoose
OHSU library has qualitative analysis workstations
For a team project, you will want to make a project and add your team members as project collaborators. This process will differ across software packages.
Make sure your documents (e.g. transcripts) are named/labeled with a consistent naming convention.
Remember privacy. What can you blind in advance of upload?
Some software packages will allow you to upload a list of codes and definitions from another document, such as Excel. For others, you will need to manually add them. In Taguette, you will add them manually.
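The consistent naming convention mentioned above can be checked automatically before upload. This is a minimal sketch that assumes a hypothetical convention of `<project>_<type>_<3-digit id>.txt`; the pattern, file names, and the function `find_misnamed` are illustrations, so adapt them to whatever convention your team chooses.

```python
import re

# Hypothetical convention: <project>_<type>_<3-digit id>.txt,
# for example "study_interview_001.txt".
NAME_PATTERN = re.compile(r"[a-z0-9]+_(interview|focusgroup)_\d{3}\.txt")


def find_misnamed(filenames):
    """Return the file names that do not follow the naming convention."""
    return [name for name in filenames if not NAME_PATTERN.fullmatch(name)]


# Hypothetical file names, for illustration only.
filenames = [
    "study_interview_001.txt",
    "study_focusgroup_002.txt",
    "Interview 3 FINAL.txt",
]
misnamed = find_misnamed(filenames)
print(misnamed)
```

Keeping participant names out of file names also supports the privacy point above.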
Have each team member code 1 or 2 documents, and then come together to talk about the process. What worked? What didn’t?
Each person can code one document as a primary coder and one document as a secondary coder
Choose a variety of documents/transcripts to start.
During your coding meeting, discuss the codes. Do you want to add any? Merge any? Edit any codes or definitions?
Make those changes (document them in your coding dictionary!) and then review the coding you previously completed.
Depending on the size of your project, continue to have periodic coding meetings to resolve any issues.
Coding can take time. A lot of time. Try not to plan to do too much in one day. It’s easy to get fatigued and start skipping over codes.
While you are coding, begin to take notes on emerging themes and analysis strategies. These are sometimes called coding “memos”.
What are common themes?
What is surprising?
You can start discussing these in coding meetings.
When to start - If you are not finished with data collection, you can load an initial set of documents for the first coding round. It is best to finish data collection before you finalize your codes and start coding – but if you have a very structured interview guide or a deductive approach that uses a specific framework, your codes will likely not change a lot and you can get started earlier.
Indexed Coding vs. Open Coding - The method above is primarily “indexed” coding, or structured coding where you are applying a set of codes, some of them very specific, to all the documents. “Open coding” is more free form (often used in grounded theory approaches), and allows the codes to emerge entirely from the data – you are creating codes as you go. The ability to use indexed coding while also maintaining the option for adding codes is “flexible coding”. These are not strict definitions – most approaches contain elements of both.
What to do with survey data? Each document may have a set of demographic descriptors (“attributes” or respondent characteristics). These could include the setting, the type of respondent (“teacher”, “student”), etc. These can come from data you collected about the sample, including survey data, or from the narrative. You can code for these in the software, in a spreadsheet while coding, or you can manage them later in a spreadsheet.
Optional Readings:
Deterding, N. M., & Waters, M. C. (2021). Flexible coding of in-depth interviews: A twenty-first-century approach. Sociological Methods & Research, 50(2), 708-739. https://doi.org/10.1177/0049124118799377
Tavory, I., & Timmermans, S. (2014). Abductive analysis: Theorizing qualitative research. University of Chicago Press. [guidance on balancing inductive and deductive approaches]
All your codes are applied. Hooray! Now it's time to pull them out of the coding platform for subsequent analysis. This process supports audit of code applications and begins the analysis process.
Finish your coding and export your coded text by code so you have a report for each code.
As you do your analysis, you may want to run additional reports, such as those with overlapping codes.
Look for codes that may need further sub-coding, which you can do outside of the software, such as in Excel. Complete this sub-coding if necessary.
Look for relationships among the data and participant attributes/characteristics. These can be from your survey data or from coded attributes.
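If you prefer to build these reports outside the coding platform, the sketch below shows one way to produce a report per code and a list of excerpts with overlapping codes. It assumes your export can be reshaped into rows with `document`, `excerpt`, and `code` fields; those field names, the sample rows, and both function names are hypothetical, so check what your platform actually exports.

```python
from collections import defaultdict

# Hypothetical export: one row per code application.
rows = [
    {"document": "interview_01", "excerpt": "I couldn't get time off work.", "code": "Barriers"},
    {"document": "interview_01", "excerpt": "I couldn't get time off work.", "code": "Great quotes"},
    {"document": "interview_02", "excerpt": "The clinic was right by my bus stop.", "code": "Facilitators"},
]


def report_by_code(rows):
    """Group (document, excerpt) pairs under each code: one report per code."""
    report = defaultdict(list)
    for row in rows:
        report[row["code"]].append((row["document"], row["excerpt"]))
    return dict(report)


def overlapping_codes(rows):
    """Return excerpts that carry more than one code, with their codes."""
    codes_by_excerpt = defaultdict(set)
    for row in rows:
        codes_by_excerpt[(row["document"], row["excerpt"])].add(row["code"])
    return {
        key: sorted(codes)
        for key, codes in codes_by_excerpt.items()
        if len(codes) > 1
    }


reports = report_by_code(rows)
overlaps = overlapping_codes(rows)
print(reports["Barriers"])
print(overlaps)
```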
You want to dig deeper into a specific theme or question
You have a code with a large amount of text that needs to be broken down into smaller units of analysis
You have two similar codes and you want to apply the same set of sub-codes (maybe a new question or theme idea came up in coding)
You are interested in the prevalence of specific codes, but you know you have some double coding and you want to clean it up
You want additional documentation for important themes
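For the prevalence case above, one simple cleanup is to count each document at most once per code, so repeated applications of the same code do not inflate the numbers. This is a sketch of that one interpretation, with hypothetical data and a hypothetical function name `documents_per_code`; your team may define prevalence differently (for example, per excerpt).

```python
from collections import defaultdict

# Hypothetical (document, code) applications, for illustration only.
applications = [
    ("interview_01", "Barriers"),
    ("interview_01", "Barriers"),  # same code applied twice in one document
    ("interview_02", "Barriers"),
    ("interview_02", "Facilitators"),
]


def documents_per_code(applications):
    """Count how many distinct documents each code appears in."""
    docs_by_code = defaultdict(set)
    for document, code in applications:
        docs_by_code[code].add(document)
    return {code: len(docs) for code, docs in docs_by_code.items()}


prevalence = documents_per_code(applications)
print(prevalence)
```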
"How do you get from your coded data to results?" is a frequent question from students. The short and frustrating answer is that the process is iterative. Consider the aims of analysis and go back to your research question(s). What outcomes are expected from the analysis?
Developing conceptual definitions
Developing typologies and classifications
Exploring associations between attitudes, behaviors, and experiences
Developing explanations of phenomena
Generating new ideas and theories
Use these goals to frame your analysis.
Themes are recurrent concepts which can be used to summarize and organize the range of topics, views, experiences or beliefs voiced by participants. The themes should have a strong basis in the data (i.e., internal validity).
Deductive: Themes are derived primarily from pre-existing theories, topics, literature, etc.
Inductive: Themes are derived primarily from the data (used in Grounded Theory)
You will likely use a mix of both.
Read through all your exported reports. Take notes.
Meet with your team and review their notes and thoughts.
Then go back and re-read the transcripts. Repeat steps 1-2 until a set of themes has emerged.
Working with the code excerpts from the full data set is called cross-case analysis (it is most common). Used for answering: What assumptions, ideas, thoughts are commonly or less commonly held?
Working with the person-level transcripts is within-case analysis. Used for answering: How do individuals create meaning?
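The difference between the two views can be sketched as two groupings of the same coded excerpts: by code across the whole data set (cross-case) or by participant (within-case). The excerpts, field names, and the function `group_excerpts` are hypothetical, and real within-case analysis works with the full transcript, not only the coded excerpts.

```python
from collections import defaultdict

# Hypothetical coded excerpts, for illustration only.
excerpts = [
    {"participant": "P01", "code": "Barriers", "text": "I couldn't get time off work."},
    {"participant": "P01", "code": "Facilitators", "text": "My manager covered my shift."},
    {"participant": "P02", "code": "Barriers", "text": "The bus doesn't run on weekends."},
]


def group_excerpts(excerpts, key):
    """Group excerpt text by the given field ("code" or "participant")."""
    groups = defaultdict(list)
    for excerpt in excerpts:
        groups[excerpt[key]].append(excerpt["text"])
    return dict(groups)


cross_case = group_excerpts(excerpts, "code")          # one code, all participants
within_case = group_excerpts(excerpts, "participant")  # one participant, all codes

print(cross_case["Barriers"])
print(within_case["P01"])
```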
Look for codes that may need further sub-coding, which you can do outside of the software, such as in Excel. Complete sub-coding if necessary.
Look for relationships among the themes and participant attributes/characteristics. These can be from your survey data or from coded attributes.
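One way to look for relationships between codes and participant attributes is a simple cross-tabulation: for each code, count how many documents of each respondent type it appears in. The sketch below uses hypothetical documents, attributes, and a hypothetical function `code_by_attribute`; it is a starting point for discussion, not a statistical test.

```python
from collections import Counter

# Hypothetical attribute (respondent type) for each document.
attributes = {
    "interview_01": "teacher",
    "interview_02": "student",
    "interview_03": "student",
}

# Hypothetical (document, code) applications.
coded = [
    ("interview_01", "Barriers"),
    ("interview_02", "Barriers"),
    ("interview_03", "Barriers"),
    ("interview_03", "Facilitators"),
]


def code_by_attribute(coded, attributes):
    """Count distinct documents per (code, attribute) pair."""
    counts = Counter()
    for document, code in set(coded):  # set() drops repeated applications
        counts[(code, attributes[document])] += 1
    return dict(counts)


crosstab = code_by_attribute(coded, attributes)
for (code, attribute), n in sorted(crosstab.items()):
    print(code, attribute, n)
```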
Once you've finished secondary analysis, you're ready for matrices.