Research
Annaliese Paulson. Active Learning and Institutional Stratification: a Text as Data Approach to Quantifying Differences in Postsecondary Curriculum. [Working Paper]
Student engagement and the use of active learning assessments have been longstanding marks of quality instruction in higher education. Building on a descriptive research tradition on student engagement and active learning practices, I argue that instructional practices offer one mechanism explaining the relatively robust literature on the positive effects of college “quality.” Drawing on a novel longitudinal panel of 830,000 syllabi linked with student administrative data from 30 public four-year institutions in Texas, I first develop and validate scalable measures of active and passive learning assessments by applying supervised machine learning with a transformer-based large language model to the text of syllabi. I use these measures to document substantial variation in active learning assessment practices across public universities in Texas. I show that institutional instructional spending likely plays a role in the emphasis on some active and passive learning assessments: relative to institutions that spend the most on instruction per student, courses offered at institutions that spend less are up to fifteen percentage points less likely to require written assignments and up to thirteen percentage points more likely to require tests, exams, and quizzes. Further, I show that, as a proportion of their total course-taking, freshmen who matriculate at the institution that spends the most on instruction enroll in up to twelve percentage points more courses that require writing assignments and up to thirteen percentage points fewer courses that require tests, exams, and quizzes. I argue that exposure to active and passive learning assessments is stratified – students attending the best-resourced institutions are more likely to encounter evidence-based teaching practices that promote student engagement, active learning, and student success. However, I also document noteworthy exceptions to this broader trend: despite spending $14,000 less on instruction per student than the state flagship, courses at Prairie View A&M University – a public historically Black college and university – are more likely to require some forms of active learning assessments, offering insight into a potential mechanism through which HBCUs may affect student outcomes.
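The paper's exact modeling choices aren't reproduced here, but a minimal sketch of the general approach – fine-tuning a transformer-based classifier on labeled syllabus excerpts with the Hugging Face transformers library – might look like the following. The base model, file names, and binary label scheme are illustrative assumptions, not the paper's specification.

```python
# Sketch only: fine-tune a transformer to flag whether a syllabus excerpt
# requires a given assessment type (e.g., written assignments).
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL = "distilbert-base-uncased"  # assumed base model, not necessarily the paper's

# Hypothetical CSVs with columns: text (syllabus excerpt), label (1 = requires
# a written assignment, 0 = does not)
data = load_dataset("csv", data_files={"train": "syllabi_train.csv",
                                       "eval": "syllabi_eval.csv"})

tok = AutoTokenizer.from_pretrained(MODEL)

def tokenize(batch):
    # Pad/truncate so the default collator can batch examples directly
    return tok(batch["text"], truncation=True, padding="max_length", max_length=256)

data = data.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

args = TrainingArguments(output_dir="assessment-classifier",
                         per_device_train_batch_size=16,
                         num_train_epochs=3)

trainer = Trainer(model=model, args=args,
                  train_dataset=data["train"],
                  eval_dataset=data["eval"])
trainer.train()
print(trainer.evaluate())
```

In practice the labeled training excerpts would come from hand-coded syllabi, with held-out data used to validate the measures before applying the classifier to the full 830,000-syllabus panel.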
Annaliese Paulson, Kevin Stange, and Allyson Flaster. (2024). Classifying Courses at Scale: a Text as Data Approach to Characterizing Student Course-Taking Trends with Administrative Transcripts. (EdWorkingPaper: 24-1042). Annenberg Institute at Brown University. https://doi.org/10.26300/7fpa-s433 [Hugging Face Model Collection] [Google Colab Notebook Demo]
Students’ postsecondary course-taking is of interest to researchers, yet has been difficult to study at large scale because administrative transcript data are rarely standardized across institutions or state systems. This paper uses machine learning and natural language processing to standardize college transcripts at scale. We demonstrate the approach’s utility by showing how the disciplinary orientation of students’ courses and majors align and diverge at 18 diverse four-year institutions in the College and Beyond II dataset. Our findings complicate narratives that student participation in the liberal arts is in great decline. Both professional and liberal arts majors enroll in a large amount of liberal arts coursework, and in three of the four core liberal arts disciplines, the share of course-taking in those fields is meaningfully higher than the share of majors in those fields. To advance the study of student postsecondary pathways, we release the classification models for public use.
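Because the classification models are released for public use (see the Hugging Face Model Collection and Colab notebook linked above), applying them should reduce to a standard text-classification call. The sketch below uses a placeholder model ID and toy course titles; consult the linked collection for the actual repository names and label sets.

```python
from transformers import pipeline

# Placeholder model ID, for illustration only; see the linked Hugging Face
# Model Collection for the actual released classifiers.
classifier = pipeline("text-classification",
                      model="example-org/course-subject-classifier")

courses = ["INTRO TO MACROECONOMICS",
           "ORGANIC CHEMISTRY LAB I",
           "19TH CENTURY BRITISH LITERATURE"]

for title, pred in zip(courses, classifier(courses)):
    print(f"{title:40s} -> {pred['label']} ({pred['score']:.2f})")
```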
Annaliese Paulson. Measuring the Liberal Arts in the Age of Big Data: Lessons and Opportunities from Computational Social Science. Invited Book Chapter. In Richard Arum, Allyson Flaster, and Paul Courant. (Eds.), The Liberal Arts Advantage: Measuring the Deeper Value. [Book Chapter Draft]
Quantitatively measuring aspects of a liberal arts education has historically been challenging because many of the mechanisms through which a liberal arts education is theorized to operate are difficult to capture in traditional quantitative surveys. Rapid transformations in data infrastructure, increases in computational capacity, and the increasingly prominent role of the internet and computers in mediating educational processes have transformed the kinds of data available to social science researchers. In this book chapter, I survey sources of novel digital trace data that can be used to quantitatively measure a liberal arts education at scale and argue for the value of doing so.
Annaliese Paulson. Measuring Breadth and Depth of Study Using Neural Embeddings. [Working Paper]
Although breadth and depth of study have been long-standing goals of postsecondary education in the United States, relatively little work has examined how breadth and depth of course-taking affect student outcomes. In part, this is because it has been difficult to produce convincing measures of breadth and depth of study. Drawing on two neural embedding methods, doc2vec and course2vec, I use machine learning to learn quantitative measures of course similarity from the text of course descriptions and the structure of postsecondary administrative transcripts. I validate these measures by showing that they have extrinsic validity as predictors of course attributes and that they capture intuitive dimensions of the postsecondary curriculum. I then present exploratory regressions describing the development of breadth and depth of study over a student's academic career.
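As a rough illustration of the doc2vec side of this approach, the sketch below embeds toy course descriptions with gensim's Doc2Vec and computes pairwise course similarities. The catalog entries and the breadth proxy at the end are assumptions made for illustration, not the paper's data or definitions.

```python
# Sketch: learn course embeddings from description text, then derive
# similarity-based quantities from them.
import itertools
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from gensim.utils import simple_preprocess

# Hypothetical course catalog: course ID -> description text
catalog = {
    "ECON101": "Introduction to microeconomic theory, markets, and prices.",
    "ECON301": "Intermediate microeconomics: consumer and producer theory.",
    "ENGL210": "Survey of British literature from 1800 to the present.",
}

docs = [TaggedDocument(simple_preprocess(text), [cid])
        for cid, text in catalog.items()]
model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)

# Course-to-course similarity from the learned embeddings
print(model.dv.similarity("ECON101", "ECON301"))  # expected: relatively high
print(model.dv.similarity("ECON101", "ENGL210"))  # expected: lower

# One possible breadth proxy: average pairwise dissimilarity of a student's
# courses (an illustrative assumption, not the paper's exact measure)
student = ["ECON101", "ECON301", "ENGL210"]
sims = [model.dv.similarity(a, b) for a, b in itertools.combinations(student, 2)]
print("breadth proxy:", 1 - sum(sims) / len(sims))
```

The course2vec step would instead learn embeddings from co-enrollment patterns in administrative transcripts rather than from description text, giving a complementary, behavior-based notion of course similarity.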