We relied on three public datasets to inform our analysis and visualization:
Gender Representation in Video Games by Brisa Palomar (Kaggle): A structured dataset that includes character-level sexualization indicators across top-selling games from 2012–2022.
Women in Video Games by Alice Corona (GitHub): Offers genre-specific insights by focusing on the “damsel in distress” trope across hundreds of games.
Predict Online Gaming Behavior by Rabie El Kharoua (Kaggle): Provides player demographic and behavioral data, allowing us to connect in-game representation with patterns of player engagement.
In addition to datasets, our project draws from a wide range of secondary sources, including peer-reviewed articles, books, nonprofit research, activist work, and museum archives. Please refer to our Bibliography page under the Data section for a complete list of references.
Our approach integrates quantitative and qualitative methods, supported by secondary research:
Quantitative: We analyzed data from the above datasets, coding for variables such as frequency, narrative role, agency, and sexualization.
Qualitative: We examined narrative tropes and character portrayals across decades, placing them in historical and cultural context.
Secondary Research: Our analysis was supported by a wide array of academic and cultural sources, ranging from digital media studies to feminist history.
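As a rough illustration of the quantitative step, the sketch below shows how coded character records might be summarized in Pandas. It is a minimal example under stated assumptions: the column names (`gender`, `narrative_role`, `sexualized`), the helper `summarize_by_gender`, and every row of data are hypothetical placeholders, not the schema or contents of the datasets listed above.

```python
import pandas as pd

# Hypothetical character-level records. Column names and values are
# illustrative assumptions, not the schema of the actual datasets.
characters = pd.DataFrame(
    {
        "game": ["A", "A", "B", "B", "C", "C"],
        "gender": ["Female", "Male", "Female", "Female", "Male", "Female"],
        "narrative_role": [
            "Protagonist", "Protagonist", "Sidekick",
            "Love interest", "Protagonist", "Protagonist",
        ],
        "sexualized": [1, 0, 1, 0, 0, 1],  # 1 = coded as sexualized
    }
)


def summarize_by_gender(df: pd.DataFrame) -> pd.DataFrame:
    """Count characters and compute the share coded as sexualized, per gender."""
    return (
        df.groupby("gender")
        .agg(
            characters=("sexualized", "count"),
            sexualized_share=("sexualized", "mean"),
        )
        .reset_index()
    )


# Frequency and sexualization rate by gender.
summary = summarize_by_gender(characters)

# Cross-tabulation of gender against narrative role.
role_counts = pd.crosstab(characters["gender"], characters["narrative_role"])

print(summary)
print(role_counts)
```

Grouping and cross-tabulating in this way keeps each coded variable visible as its own column, which makes it easier to revisit a coding decision later without redoing the whole analysis.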
We approach our work through a feminist digital humanities lens, guided by:
The principles of Data Feminism: that data is not neutral and must be interrogated for bias and context.
Adrienne Shaw’s assertion that representation is not inherently positive or negative, but political and relational.
A reflexive awareness of our own interpretive role, including the ways confirmation bias and cultural assumptions shape our definitions of “empowerment” and “sexualization.”
To conduct our research and build this project, we used the following tools and platforms:
Programming & Analysis: Google Colab with Python as our primary coding language; libraries including Pandas, Matplotlib, and Seaborn for data cleaning, processing, and visualization.
Collaboration: Google Docs and Discord for organizing team communication, writing, and planning.
Website: Google Sites served as the platform for designing and publishing our project, allowing us to integrate visualizations, narrative sections, and external resources.
Our visual storytelling includes character imagery, graphs, timelines, and genre-based visualizations, all intended to make our findings accessible and engaging for a wide audience.
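For readers curious how a graph of this kind could be produced with the tools named above, here is a minimal sketch using Matplotlib. The values plotted are made up for illustration and are not project results; the function name `plot_share`, the column names, the colors, and the output filename are likewise assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; not required inside Google Colab
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical summary values for illustration only; not project findings.
summary = pd.DataFrame(
    {"gender": ["Female", "Male"], "sexualized_share": [0.75, 0.0]}
)


def plot_share(df: pd.DataFrame):
    """Draw a labeled bar chart of the share of characters coded as sexualized."""
    fig, ax = plt.subplots(figsize=(5, 3))
    ax.bar(df["gender"], df["sexualized_share"], color=["#7b3294", "#008837"])
    ax.set_ylim(0, 1)  # shares are proportions between 0 and 1
    ax.set_xlabel("Character gender")
    ax.set_ylabel("Share coded as sexualized")
    ax.set_title("Illustrative example (made-up data)")
    fig.tight_layout()
    return fig, ax


fig, ax = plot_share(summary)
fig.savefig("sexualized_share.png", dpi=150)
```

Explicit axis labels and a fixed 0 to 1 scale are small choices that help a chart stay readable for a general audience.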
This website was designed with accessibility and clarity in mind. We prioritized:
Readable fonts and consistent color palettes
Clear navigation between tabs
Multimodal presentation of data and argument
Our goal was to present complex cultural and statistical analysis in a way that is legible to both academic and general audiences.
We would like to sincerely thank:
Professor Andressa Maia, GSI Elizabeth Dresser-Kluchman, and the DIGHUM 100 Teaching Team for shaping our understanding of digital humanities and providing thoughtful feedback and guidance throughout the project.
The creators and curators of publicly available datasets and open-access research, especially those hosted on Kaggle and GitHub, whose work made our analysis possible.
The scholars whose ideas, methodologies, and critiques form the foundation of our project.
Finally, we are grateful to each member of our team for their effort, creativity, and collaborative spirit.