Pretty Terrifying Project | DHUM 70002 Digital Humanities: Methods and Practices (Spring 2026)

Data Management Plan

What data will you collect or create?

This project produces a curated dataset examining feminist themes within horror video games. The following data will be collected for each title:

Game Title
Wikipedia URL
Developers
Female Developer Present
Release Date
Platforms
Horror Subgenre
Player Perspective
Female Protagonist Playable Character?

Themes and keywords related to:

Motherhood
Domesticity
Trauma and Mental Illness
Embodiment
Captivity
Violence
Sexualized Violence
Girlhood
LGBTQ+
Creed Archetypes
Suggested Supporting Evidence
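
The fields above could be represented as one record per game. The following is a minimal sketch only: the class name, field names, types, and the flattening helper are illustrative assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class GameRecord:
    """One row of the dataset (hypothetical field names and types)."""
    title: str
    wikipedia_url: str
    developers: list[str] = field(default_factory=list)
    female_developer_present: Optional[bool] = None  # None = not yet verified
    release_date: str = ""
    platforms: list[str] = field(default_factory=list)
    horror_subgenre: str = ""
    player_perspective: str = ""
    female_protagonist_playable: Optional[bool] = None  # None = not yet verified
    # Maps a theme name (e.g. "Motherhood") to the keywords found for it.
    themes: dict[str, list[str]] = field(default_factory=dict)
    supporting_evidence: str = ""


def to_flat_row(record):
    """Flatten a record into a dict of scalar values, e.g. for csv.DictWriter."""
    row = asdict(record)
    row["developers"] = "; ".join(record.developers)
    row["platforms"] = "; ".join(record.platforms)
    row["themes"] = "; ".join(
        f"{theme}: {', '.join(words)}"
        for theme, words in sorted(record.themes.items())
    )
    return row
```

Using None for the yes/no fields keeps "not yet checked" distinct from "no", which may be useful while manual review is ongoing.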

How will the data be collected or created?

Data collection began with a Wikipedia web scrape using Python, BeautifulSoup, and wikipediaapi to gather all pages and subcategories within Category:Horror_video_games. Data such as each game’s title, URL, and category were extracted to compile a curated list of games featuring female characters. A classifier was then built using control phrases and keywords to identify games featuring female characters and potential feminist themes. Data collection is ongoing, and results continue to be reviewed using both computational and manual methods.
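
The two steps described above, walking the category tree and matching control phrases, can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the function names, the depth limit, and the phrase list are hypothetical. The walker expects objects that expose `categorymembers`, `ns`, `title`, and `fullurl`, which is how wikipediaapi page objects are assumed to behave here.

```python
import re

# MediaWiki namespace numbers: 0 = article pages, 14 = category pages.
NS_MAIN = 0
NS_CATEGORY = 14

# Hypothetical control phrases; the project's real list is not reproduced here.
FEMALE_CHARACTER_PHRASES = ["female protagonist", "heroine", "young woman"]


def walk_category(category_page, max_depth=2, _depth=0, _seen=None):
    """Collect (title, url) pairs for article pages under a category.

    Subcategories are followed up to `max_depth` levels. Titles already
    visited are skipped, so pages listed in several subcategories appear once.
    """
    seen = _seen if _seen is not None else set()
    results = []
    for title, member in category_page.categorymembers.items():
        if title in seen:
            continue
        seen.add(title)
        if member.ns == NS_CATEGORY:
            if _depth < max_depth:
                results.extend(walk_category(member, max_depth, _depth + 1, seen))
        elif member.ns == NS_MAIN:
            results.append((member.title, member.fullurl))
    return results


def match_phrases(text, phrases):
    """Return the phrases that occur in `text` (case-insensitive, whole words)."""
    found = []
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(phrase)
    return found
```

With wikipediaapi installed, the starting object would presumably come from something like `wikipediaapi.Wikipedia(user_agent="...", language="en").page("Category:Horror video games")`, and `match_phrases` would be run over each page's text. Returning the matched phrases rather than a bare yes/no leaves a trail for the manual review step.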

Documentation and Metadata

Basic documentation and process methodology will be provided.

Documentation will include the following:

Data_Dictionary
Methodology
Basic Software Requirements
README

Ethics and Legal Compliance

How will you manage any ethical issues?

This project does not contain personal information about any individual. All materials are publicly available.

Storage and Backup

How will the data be stored and backed up during the research?

During the research and development stage of this project, all information will be stored in cloud storage using the following platforms:

Google Drive
GitHub

Team members will update both platforms regularly, and changes will be saved automatically.

How will you manage access and security?

The final product will be a public-facing website accessible to all. All data collected will also be openly available. During the research phase, only team members and, as needed, the faculty advisor will be granted access to the platforms used.

Selection and Preservation

Which data are of long-term value and should be retained, shared, and/or preserved?

The dataset of horror games and their feminist themes
The data visualizations created from that dataset
The code for the website presenting those visualizations
Any accompanying documentation and written material connected to the project

What is the long-term preservation plan for the dataset?

We will store the dataset in a data repository such as Kaggle and will look into institutional repositories like CUNY Academic Works for the preservation of other aspects of the project. Additionally, we plan to keep the website accessible for as long as possible, using free hosting on GitHub Pages.

Data Sharing

How will you share the data?

Final data will be shared via GitHub and will be openly available to the public for review or research purposes.

Are any restrictions on data sharing required?

Data produced through this project will be available under a Creative Commons Attribution (CC BY) license.