Hopefully by now your research has generated an abundance of data. Whether from individual interviews or focus groups, transcribed interviews produce pages and PAGES of text that need to be reviewed and categorized in a systematic manner. It can be a daunting task, but having a plan and a user-friendly qualitative software program can help a great deal!
We'll review some organizational strategies and introduce you to Taguette, a free software program that will help you categorize your data. Using a systematic process to analyze your data increases the power of the story you can tell about it. A strong analysis brings important themes into sharper focus, highlights patterns in people's experiences and thoughts, and situates their voices in a broader community context. Let's put your data to work!
Code (qualitative): a short word or phrase that sums up or explains what a section of text means.
Codebook: a document listing all qualitative codes used in a project, along with their definitions and examples.
Deductive codes: qualitative codes based on research questions and existing literature.
Encryption: converting text to a non-readable format that must be unlocked with a key for viewing.
External hard drive: a portable storage device for securely saving data that can be connected to a computer using a USB cable.
Inductive codes: qualitative codes based on the data and experience conducting interviews.
Latent or analytic codes: suggest a broader meaning or relevance of a concept. Typically goes beyond what is directly stated to the larger implications.
Manifest or descriptive codes: summarize in a word or phrase, usually a noun, what is being discussed.
Pseudonym: a fictitious name assigned to or chosen by a participant that is recorded in the data instead of their real name, to protect confidentiality.
Theme: a category summarizing a prevalent topic in the data.
Transcript: an exact record of what the interviewer/moderator and participant(s) said in an interview or focus group, word-for-word.
Remember that in qualitative research data are the texts, images, or other artifacts created by research participants as they interact with the researcher. More specific to interview and focus group research, data consist of interview transcripts, which are word-for-word written records of the recorded interview or focus group conversation. We'll start by reviewing the process of creating and securely storing transcripts.
If you were able to secure permission to record your interviews and/or focus groups, you will now need to convert your audio recordings to typed documents, or interview transcripts. To do this, you'll replay the audio recording of each interview and type each word spoken by the interviewer and participant, noting who has said what. It's important to also capture meaningful pauses, moments your participant struggled to find the right words, and tone of voice - including laughter. These notes can be set off in brackets - [ ] - to indicate that they are contextual notes rather than statements.
To say that transcription is time consuming is an understatement! Some automated transcription services, such as Otter.ai, can ease the pain and time commitment of transcribing interviews and focus group meetings. However, these automated services are imperfect, and the transcripts they generate must be reviewed against the original recording to verify accuracy and correct discrepancies.
Many people hire a professional typist to transcribe their data. This is efficient, but it also removes a fruitful opportunity for the researcher to review the conversation again. Each time the researcher interacts with their data - first while collecting it, again while transcribing, and further while coding - fresh insights are gained.
Each interview transcript will be recorded in a separate document. The layout of the document will reflect the transitions from one speaker to the next. For example:
Interviewer: Thank you for taking the time to meet with me today.
Participant 1: It's my pleasure. Tell me more about your research.
Interviewer: Right, ok. I am studying how safe people feel in their neighborhoods.
Participant 1: Oh that's interesting.
And so forth. When transcribing a focus group this can become more complicated as you try to differentiate between the voices in the recording. It can be helpful to have a research assistant or second moderator make notes during the focus group to help with this. A good technique is for each participant to be assigned a number or letter. The assistant would then make a note of this indicator and the first few words each time someone spoke. These notes can then help to reassemble the discussion when transcribing.
Interview recordings and transcripts are sensitive materials because they contain the private statements, spoken in confidence, of interview or focus group participants. Participants entrust researchers to protect the confidentiality of their statements and identities. This means that interview recordings and transcripts must be stored in a secure location - physical as well as digital - that can only be accessed by authorized members of the research team.
Physical artifacts, like audio recorders, hard copies of interview transcripts, or other identifying documents, can be secured within locked filing cabinets that are located in locked office spaces.
Digital artifacts, like audio files and digital transcripts, can be secured using an external hard drive. External hard drives are data storage devices that can be used to back up files. They are considered more secure storage spaces than cloud storage options, like Dropbox or Google Drive, because files saved to an external hard drive are not connected to the internet. The external hard drive itself should be treated like a physical artifact and stored in a locked filing cabinet and/or office.
Some people prefer to secure digital files using encryption, which converts files to a non-readable format that can only be opened by entering a key. Read more about encryption from the SANS Institute, below.
Unless explicitly given permission to identify the speaker, researchers should remove the names of participants as they transcribe the interviews. They may replace names with numbers (as shown above), or with pseudonyms - fake names assigned to or chosen by participants.
If interview transcriptions are deidentified by removing names, a master list may be maintained that links the assigned identities to the original identity of the participant. It is important that the master list never be stored in the same location (physically or digitally) as the interview transcripts.
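For researchers comfortable with a little scripting, deidentification can be partly automated. The sketch below, written in Python, uses a master-list dictionary to replace real names with pseudonyms in a transcript. All names and transcript text here are invented examples, and in practice the master list itself would be stored separately from the transcripts.

```python
# Minimal sketch: replacing participant names with pseudonyms.
# The names, pseudonyms, and transcript text below are invented examples.

import re

# The "master list" linking real names to assigned identities.
# In a real project this mapping is stored apart from the transcripts.
master_list = {
    "Maria Lopez": "Participant 1",
    "James Carter": "Participant 2",
}

def deidentify(transcript: str, names: dict) -> str:
    """Replace each real name in the transcript with its pseudonym."""
    for real_name, pseudonym in names.items():
        transcript = re.sub(re.escape(real_name), pseudonym, transcript)
    return transcript

text = "Maria Lopez: I moved here in 2010. James Carter and I are neighbors."
print(deidentify(text, master_list))
# Prints: Participant 1: I moved here in 2010. Participant 2 and I are neighbors.
```

Automated replacement only catches exact matches, so a manual read-through for nicknames, misspellings, and other identifying details is still essential.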
The goal of qualitative analysis is typically to take a large volume of information and condense it into manageable chunks (Sheppard 2020). The most commonly used approach is thematic analysis, in which data are reviewed to identify patterns and repeated ideas. The researcher then codes the data, highlighting material that reflects these patterns and ideas. Coding is similar to adding a hashtag (#muskegonheights) to a social media post: just as a search for that hashtag brings together every post where it has been used, qualitative analysis brings together every passage where a code has been applied.
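The hashtag analogy can be made concrete with a small sketch. The Python snippet below (all excerpts and code labels are invented examples) tags each excerpt with one or more codes, then retrieves every excerpt sharing a code, which is essentially what CAQDAS "code and retrieve" does.

```python
# Minimal sketch of code-and-retrieve: each excerpt is tagged with one
# or more codes, and retrieval gathers every excerpt sharing a code,
# much like searching a hashtag. Excerpts and codes are invented.

from collections import defaultdict

# (excerpt, codes) pairs a researcher might produce while coding.
coded_segments = [
    ("I always walk with a friend after dark.", ["safety", "social ties"]),
    ("The new streetlights made a big difference.", ["safety", "infrastructure"]),
    ("My neighbors check in on each other.", ["social ties"]),
]

# Build an index from each code to its excerpts, like a hashtag index.
index = defaultdict(list)
for excerpt, codes in coded_segments:
    for code in codes:
        index[code].append(excerpt)

# Retrieve everything coded "safety".
for excerpt in index["safety"]:
    print(excerpt)
```

Software like Taguette maintains exactly this kind of index behind the scenes, so the researcher can click a tag and see every highlighted passage at once.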
Topical categories in qualitative data are often called themes. Within the data, themes are represented by short, descriptive notes called codes that summarize what a section of text means. Learn more about the process of assigning codes and identifying themes in the accompanying slide presentation.
Once upon a time, qualitative coding involved huge binders full of color-coded, highlighted text that had to be manually leafed through by the researcher. What a mess! Today qualitative researchers have the advantage of using computer assisted qualitative data analysis systems (CAQDAS) that greatly improve the efficiency of assigning codes to data, retrieving coded segments, and generating visualizations and outputs. See the example in the accompanying video.
Many CAQDAS programs exist and can be purchased for hefty sums. Taguette is an open-source CAQDAS package that is simple to operate and freely available to anyone with an internet connection. Taguette has an excellent collection of instructional resources, linked below.
Qualitative data analysis involves intensive, repeated review of transcriptions from individual interviews and/or focus groups. It is a lengthy process, but it's important to give it the time that it takes. The more a researcher interacts with their data - during collection, transcription, and multiple rounds of analysis - the deeper their understanding of conceptual themes grows.
The coding process may begin deductively, starting with the research questions or theoretical insights that guided the research design, or it may begin inductively, emerging from themes expressed by participants in the data. In either case, the goal is to assign labels to sections of text that categorize the information conceptually and create order in the data.
Using a qualitative software program to analyze data on a computer can make the process more efficient. Codes can be recalled for review and comparison on a screen together and coded data can be exported for storage and use in reporting. Many software programs exist, but Taguette is particularly useful for entry-level coding because it is simple to operate and freely available online.
What personally identifiable data will your research generate, and how will you make sure it is securely stored?
Based on the interviews you have conducted and transcribed already, what are some topics that were frequently discussed or otherwise stand out to you as important to the people with whom you spoke?
Which aspects of developing codes, creating the codebook, or applying codes to your data remain the fuzziest to you?