Class of 2025 majoring in Computer Science. From Southern California and loves to go to concerts and watch movies!
Class of 2025 majoring in Computer Science with a minor in Film/Media Studies. From Lancaster, PA and loves to crochet and eat!
Thao Nguyen (left) and Marina Anglo (right) standing in front of their research posters at the Susquehanna Valley Undergraduate Research Symposium (SVURS) 2024.
Sequence analysis is used to analyze sequences of states, capturing the order and timing of these occurrences. We present some results from sequence visualization and the analysis of sequence complexity.
The dataset we used was provided by the PA Department of Corrections (PA DOC) through a DOC process. It contains data from all of the PA state prisons with years ranging from 2004 to 2015. Our sequence analysis captures how many prisons a person was in, how many times they were in each security level, and how often the security level changes. We defined a “step” to be a single unit of movement from prison to prison. For example, a person convicted of crime may go from security level A to security level B, so security level A would be the first step and security level B would be the second step.
We used data from people with 10 or fewer steps because they made up approximately 90% of the data. Including people with more steps would have meant accounting for outliers and would have made the sequences harder to visualize.
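To make the idea of a "step" and the 10-step cutoff concrete, here is a minimal Python sketch (the project itself used R). The record layout, the person IDs, and the values are all made up for illustration; the actual PA DOC column names are not shown on this page.

```python
from collections import defaultdict

# Hypothetical movement records: (person_id, move_order, security_level).
# These rows are invented; they are not taken from the PA DOC dataset.
records = [
    ("p1", 1, "MED"), ("p1", 2, "CLOSE"), ("p1", 3, "MAX"),
    ("p2", 1, "MIN"), ("p2", 2, "MIN"),
    ("p3", 1, "DIAG"),
] + [("p4", i, "MED") for i in range(1, 13)]  # 12 steps: beyond the cutoff

MAX_STEPS = 10  # cutoff described in the text


def build_sequences(rows):
    """Group rows by person and order each person's levels by move order."""
    by_person = defaultdict(list)
    for person_id, order, level in rows:
        by_person[person_id].append((order, level))
    return {pid: [lvl for _, lvl in sorted(moves)] for pid, moves in by_person.items()}


sequences = build_sequences(records)

# Keep only people with 10 or fewer steps.
kept = {pid: seq for pid, seq in sequences.items() if len(seq) <= MAX_STEPS}
share_kept = len(kept) / len(sequences)

print(kept)
print(f"share of people kept: {share_kept:.0%}")
```

In this toy example, three of the four people are kept; in the real data the same filter retained approximately 90% of people.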
The ranking of the security levels from most severe to least severe:
MAX (maximum), CLOSE (close), MED (medium), MIN (minimum), DIAG (diagnostic), PSYCH (psychiatric)
Frequent transfers can make things more difficult for people convicted of crime, as they may be placed farther away from their family and hometown, which affects their mental health and rehabilitation.
Examining these changes in security levels may reveal how these individuals are managed based on perceived risk and behavior.
Black people make up 10% of Pennsylvania’s population, but 46% of people in state prisons are Black. White people make up 75% of Pennsylvania’s population, but 44% of people in state prisons are White. Comparing these two statistics, Black people are overrepresented in the state prison population and White people are underrepresented.
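One way to put the two comparisons on the same scale is to divide each group's share of the prison population by its share of the state population. The short Python sketch below does this with the percentages quoted above; the name "representation ratio" is our own label for this quotient, not a term from the data source.

```python
# Percentages quoted in the text: share of PA population and share of state prison population.
shares = {
    "Black": {"population": 10, "prison": 46},
    "White": {"population": 75, "prison": 44},
}


def representation_ratio(prison_share, population_share):
    """Ratio above 1 means overrepresented in prison; below 1 means underrepresented."""
    return prison_share / population_share


ratios = {
    group: representation_ratio(s["prison"], s["population"])
    for group, s in shares.items()
}

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

This prints 4.60 for Black people and 0.59 for White people, so the Black share of the state prison population is about 4.6 times the Black share of the state population.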
We wanted to examine the movements from prison to prison and changes in security level of people convicted of crime in Pennsylvania, specifically between racial groups (White, Black, Asian, Hispanic, Native American, and Other), for potential differences.
The figures below use the following abbreviations:
B: Black people convicted of crime
W: White people convicted of crime
V1, ..., V10: step number for a person
FIGURE 1: We used sequence index plots to visualize the movements of each person convicted of crime. On the y-axis, each horizontal slice is one person and their movements. For example, in the White group, person #1 starts off in medium security at step V1 and ends up in maximum security at step V10.
FIGURE 2: We wanted to see if there were any noteworthy trends in the sequence index plots, so we aggregated them into sequence distribution plots. In step V1, the orange bar, which indicates the maximum security level, is larger for Black people convicted of crime than for White people. This means that Black people are more likely to start off at a maximum security prison than White people. NOTE: This might be due to a geographic confounder. For example, Black people make up a large share of Philadelphia's population, and if the nearest prison happens to be maximum security, Black people convicted of crime in Philadelphia may be sent to that prison out of convenience.
Figure 1: Sequence index plots where the y-axis is each person and the x-axis is their chronological steps from prison to prison. Note that some people repeat security levels.
Figure 2: Sequence distribution plots where the y-axis is the relative frequency across all people and the x-axis is their chronological steps from prison security level to prison security level.
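The aggregation behind a sequence distribution plot can be sketched in a few lines of Python. This is only an illustration with made-up sequences, not the R code used for the figures. It assumes that, at each step, the relative frequency is computed among the people who actually reached that step.

```python
from collections import Counter

LEVELS = ["MAX", "CLOSE", "MED", "MIN", "DIAG", "PSYCH"]

# Made-up sequences: one list of security levels per person.
sequences = [
    ["MAX", "CLOSE", "MED"],
    ["MED", "MED"],
    ["MAX", "MED", "MIN", "MIN"],
    ["DIAG"],
]


def state_distribution(seqs, levels):
    """For each step, the share of people in each level among those who reached that step."""
    n_steps = max(len(s) for s in seqs)
    dist = []
    for step in range(n_steps):
        counts = Counter(s[step] for s in seqs if len(s) > step)
        total = sum(counts.values())
        dist.append({lvl: counts[lvl] / total for lvl in levels})
    return dist


distribution = state_distribution(sequences, LEVELS)
for i, step_shares in enumerate(distribution, start=1):
    # Show only the levels that occur at this step.
    print(f"V{i}", {lvl: round(p, 2) for lvl, p in step_shares.items() if p > 0})
```

Each entry of `distribution` corresponds to one stacked bar in the plot: the heights of the colored segments at step V1, V2, and so on.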
FIGURE 3: These transition matrices show how likely a person from each racial group is to go from one security level to another. The y-axis represents the current state of the person convicted of crime and the x-axis is the next state of the person convicted of crime. The more likely a person is to go from one state to another, the redder the square is. For instance, the rate at which a Black person convicted of crime transitions from a close security prison to a minimum security prison is 0.26, while for a White person convicted of crime it is 0.44. This means that White people are 18 percentage points more likely to go down from a close security prison to a minimum security prison.
FIGURE 4: These histograms show the transitions per person convicted of crime for both Black people and White people. The blue dashed lines are the mean number of transitions for each group. The blue dashed line is farther to the right for Black people than for White people, meaning that the mean is higher for Black people (2.925) than for White people (2.587). This shows that Black people, on average, have a higher number of movements from prison to prison than White people.
Figure 3: Transition matrices for each racial group, computed across all the steps of each person convicted of crime. The y-axis is the current state of the person convicted of crime at step t, and the x-axis is the next state the person would be in at step t + 1.
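A transition matrix of this kind can be sketched as follows in Python. The two groups and their sequences are invented for illustration, and the function is a plain re-implementation of the idea rather than the R routine used for the figure. Each row is normalized so that the rates out of a given security level sum to 1.

```python
from collections import Counter, defaultdict


def transition_rates(seqs):
    """rates[a][b] = share of all moves out of level a (step t) that go to level b (step t + 1)."""
    counts = defaultdict(Counter)
    for seq in seqs:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


# Made-up sequences for two hypothetical groups, "A" and "B".
groups = {
    "A": [["CLOSE", "MIN"], ["CLOSE", "MAX"], ["CLOSE", "MIN", "MIN"], ["CLOSE", "MED"]],
    "B": [["CLOSE", "MIN"], ["CLOSE", "MAX"], ["CLOSE", "MAX"], ["CLOSE", "MED"]],
}
rates = {name: transition_rates(seqs) for name, seqs in groups.items()}

rate_a = rates["A"]["CLOSE"]["MIN"]
rate_b = rates["B"]["CLOSE"]["MIN"]
gap_points = (rate_a - rate_b) * 100  # difference in percentage points

print(f"CLOSE to MIN: A={rate_a:.2f}, B={rate_b:.2f}, gap={gap_points:.0f} percentage points")
```

The gap is read the same way as in the figure: rates of 0.44 and 0.26 differ by 0.18, which is 18 percentage points.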
Figure 4: Histograms showing the transitions per person convicted of crime. The blue dashed line is the mean number of transitions. The number of transitions counts how often the states in a sequence change security level and is normalized relative to the length of the sequence.
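Counting transitions per person can be sketched in Python like this. The sequences are made up, and the normalization shown (dividing by the maximum possible number of changes, length minus 1) is one common choice that we assume here for illustration; it may differ from the exact setting used for the figure.

```python
from statistics import mean


def count_transitions(seq):
    """Number of adjacent steps where the security level changes."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def normalized_transitions(seq):
    """Transitions divided by the maximum possible (length - 1); 0.0 for one-step sequences."""
    return count_transitions(seq) / (len(seq) - 1) if len(seq) > 1 else 0.0


# Made-up sequences for two hypothetical groups, "A" and "B".
groups = {
    "A": [["MED", "CLOSE", "MAX"], ["MIN", "MIN"], ["DIAG", "MED", "MED", "MIN"]],
    "B": [["MED", "MED"], ["DIAG", "MIN"], ["MAX"]],
}

# Mean raw transition count per group (the quantity marked by a dashed line in a histogram).
mean_transitions = {
    name: mean(count_transitions(s) for s in seqs) for name, seqs in groups.items()
}

for name, value in mean_transitions.items():
    print(f"group {name}: mean transitions = {value:.3f}")
```

Note that staying at the same level (for example MIN to MIN) does not count as a transition, which is why the second person in group A contributes 0.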
There was a difference in how the security levels changed in the sequences between Black and White people convicted of crime:
FIGURE 2: Black people are more likely to start in a maximum level security prison than White people.
FIGURE 3: When comparing the transition matrices of White and Black people convicted of crime, it is evident that there is a difference in their trajectories. A Black person convicted of crime is …
18 percentage points less likely to go down from CLOSE to MIN compared to a White person convicted of crime.
13 percentage points less likely to go down from MED to DIAG compared to a White person convicted of crime.
14 percentage points less likely to go from MIN to MIN compared to a White person convicted of crime.
10 percentage points less likely to go from PSYCH to DIAG compared to a White person convicted of crime.
7 percentage points more likely to go up from MED to MAX compared to a White person convicted of crime.
12 percentage points more likely to go up from CLOSE to MAX compared to a White person convicted of crime.
FIGURE 4: From the histograms comparing sequence variability, we found noteworthy trends in transitions per person.
All other racial groups had a higher average number of transitions than White people.
White people had the lowest average of 2.587, while Black people had an average of 2.925.
These trends suggest a racial disparity in prison transitions in PA for people who are in our dataset.
Sequence analysis: Its past, present, and future, Social Science Research: The paper we read to learn about sequence analysis and on which we based our visualizations.
Pennsylvania Profile: Information on Pennsylvania's incarceration system.
TraMineR: The package used in R to compute the sequence metrics (transitions per person, longitudinal entropy, sequence turbulence).
ggseqplot: The package used in R to create the visualizations for our sequence analysis.
ggplot2: The package used in R to create the visualizations of the metrics.
R: The language used for visualizations and data analysis.
Analyze sequences divided up by other protected classes such as sex, age, and mental health status.
Investigate a possible geographic confounder for initial security level.
Investigate a potential time confounder in which longer sentences mean more transitions.
Look into a different data set, reasons, where each value in the sequence is the reason for the move to a different prison.
Before working on this project, I had previous experience in R, but I have learned a lot more through this experience. Balancing the task of learning new R packages while creating visualizations and analyzing plots was challenging yet rewarding.
Initially, I had some knowledge of the US incarceration system, but this project deepened my understanding of the experiences of people convicted of crime. Not only did we create and analyze these plots, but we were also able to talk to the other faculty who were working on the project. These faculty members, from fields such as Geography and Computer Science, provided diverse perspectives that deepened our discussions. It was fascinating to hear about their work and see how interdisciplinary collaboration can enhance our understanding of these complex issues.
While putting together the research poster to create a cohesive "story" from the data, I realized the importance of presenting our findings in a way that highlights the real-world impact. This process emphasized the need for ongoing work to fully understand the complexity of data and its implications.
Overall, this project has been an enriching experience that combined technical learning and a deep social understanding. Not only did it enhance my data analysis skills, but it has also taught me to consider the human stories behind the data.
Prior to this project, I had some knowledge of the incarceration system in the United States. I learned a lot more about the experiences of people convicted of crime through this research. I had no idea that moving from prison to prison could have such a significant impact on those convicted of crime.
Creating the plots and seeing the trends confirmed my beliefs about the disparities within the incarceration system. I had seen statistics similar to the ones we found in our research, but knowing that these are numbers that we discovered ourselves is a different kind of experience for me--it feels much more real.
Learning how to make the plots with the sequence analysis framework in mind helped me improve my skills in exploratory data analysis (EDA). I had some experience with R before this project, but I strengthened my skills in the language by exploring new ways to visualize data and make observations based on the data.
I feel that this project was a very valuable experience. Not only did I learn about data and how to deal with data, but I also learned to consider the stories behind the data we worked with, always remembering that the numbers we dealt with are real people.