Three Heads are Better Than One

Sasha’s Project Journal (2/25/2026)

Revising the project proposal turned out to be more clarifying than I expected. What started as an exercise in tightening our language became a real opportunity to interrogate the “what” and “why” behind what we’re actually building. The revision also gave us a chance to solidify something we’d been leaning toward: expanding beyond the original Black Knowledge Erasure Dataset (BKED) to include a parallel dataset on Puerto Rican history. We deliberately narrowed it there. Latin American history is a whole can of worms on its own, and questions like what even counts as ‘Latin American’ could swallow the project whole if we let them. Puerto Rican history gives us a focused, specific entry point without opening up more scope than one semester can reasonably hold. Putting it into the formal proposal made the reasoning click into place. Comparing how epistemic erasure operates across different diasporas gets us much closer to saying something meaningful about the pattern, rather than just documenting isolated incidents. It also raises another question: does erasure look the same across different communities, or does it take distinct forms depending on how a group has been historically represented (or misrepresented) in the sources these models were trained on?

Working through this revised proposal with the team has been really eye-opening, because having three people invested in the same problem brings out angles I wouldn’t have caught on my own. The research lead is keeping our sourcing rigorous and our historical grounding solid. The design and UX ideas are already pushing us past the “spreadsheet” version of this project toward something that could function as a real public-facing archive. And the outreach and documentation brainstorming addresses something I think often gets underestimated: making our work legible to people outside the digital humanities world. If we want this to land as an Open Educational Resource, that translation will matter enormously. I’m excited about the direction we’re heading and genuinely more confident about our momentum now. And it’s great that everyone is running with their piece of this, and running with it well.

Side note – a conversation I had outside of class this week has been sitting with me in a way that keeps complicating things, in the best way. I’ve been thinking about the tension at the core of this project: we’re using archival sources as a “gold standard” to fact-check AI, but history itself has never been a neutral body of objective facts. Who tells a story, from where, and with what stakes all shape what gets recorded and what gets left out. That’s pushed me to think more carefully about the difference between asking who the AI fabricates versus why it fabricates the way it does. The “who” gives us our dataset. The “why” is where the deeper argument lives. We’re not here to ding AI for getting things wrong. We’re here to document how these systems imagine race, gender, and sexuality through an unexamined kind of creative writing, and to make that visible in a way that’s genuinely useful.

Nuts and Bolts

This week I have less to share, but there has been some progress nonetheless.

One of the joys of automation is that moment when, having spent the time putting the pieces in place, you flip the switch and see all of those pieces do exactly what they’re supposed to do.

And of course one of the major sorrows of automation is when you flip that switch and nothing happens, or not everything goes according to plan.

When we met last week, Natalia and I agreed I should stand up our chosen system on a hosting option that we had more control over. Presently that means it’s under my own hosting environment, which I keep online for a range of other things (namely two blogs, a virtual tabletop system, various other web materials, and my Calibre web server). This will suffice as we continue to prototype and determine how best to architect the information presentation. Whether it will suffice for the longer term is unclear.

Unfortunately for us, though, I have missed my target for the week because when I flipped the switch to turn on the site at its new, perhaps temporary home, it didn’t work the same way it did when I was testing it. Something is wrong, and figuring out what will set me back a bit. Fortunately, this should not be insurmountable, merely inconvenient. All the pieces are in place, but one seems to be misaligned.

When I fix this, I assume it will be useful to document what I did so others following this path will benefit from it.

Meanwhile, we will hammer out our more formalized week-by-week work plan, and Natalia is continuing to collect and document entries elsewhere in anticipation of the site.

Out of Order

We continue on our journey through the horrors. Is there anything more terrifying than the realization that you have been going about this all wrong? That you are not as well guarded from the monsters as you thought? That something has been lurking in the shadows this whole time…

That you’ve done stuff in a weird order?

“What do you MEAN most people actually have a plan for their terrifying quests and don’t just roll up to a haunted house they find to look around and go from there?”

Let’s take a step back for a moment and get into the context. Last semester’s Intro to Digital Humanities class ended with a final project – to create a proposal for a larger project that we could then build on further this semester. Alternatively, we could just write a paper instead, and then just jump on someone else’s project when the time came. Pretty simple, right? Except- well, last semester, for the first time ever, there was a secret third option – the dataset. Last semester, people had the option to put together a dataset, along with some documentation, with the idea that this data could be used for future research. The dataset wasn’t a formal project proposal but it was still a piece of a project – a piece of many potential projects, even, considering that the same dataset can be used for numerous purposes.

This new option ended up getting a lot of mileage – two out of the three teams this semester, including the team I’m on, have started not with a proposal, but with a dataset. We’re in brand new territory for this methods and practices class.

Usually, I would imagine, with a project starting from a proposal, you have a pretty exact idea of what you’re creating. Natalia’s Lunfardo dictionary for example, this semester’s one project that did emerge from a proposal, has a clearly defined output: an online dictionary to serve as a learning resource. My own project proposal from last semester was similar: its planned output would have been a digital timeline of a particular historical event. Based on my experience last semester, the “pre-work” for proposal-based projects mostly consisted of doing just enough research to refine your initial idea into something that you could be sure would work, then focusing the rest of your energy on planning out the process, on how exactly you will get it to work. Once you actually do the project, you’re filling the empty vessel you created, by doing further research, creating the content you planned to create, and presenting that to your audience.

But what happens when you start with the filling but don’t have the structure to contain it yet? Well, I guess what happens is what I described in last week’s post. But it did take me until this week to realize that wasn’t actually the intended progression. Which was admittedly a little alarming. Sometimes horror is about establishing clear rules… and then punishing even the people who follow them perfectly. In those cases, the fear comes from this idea that you can’t escape the terrible thing no matter what you do. But other times, horror establishes the rules so that when someone breaks them…

You immediately know they’re doomed.

Thankfully, this is not the case for digital humanities projects. In fact, one of the main ideas I took away from last semester’s class was that everything is worth questioning, including the rules themselves, and the conventional wisdom that goes with them. Breaking a rule may cause a few headaches – in our case having to put together a proposal relatively quickly when we would have already had one had we gone down the traditional path – but it’s still completely workable. And things also balance out down the line- for example, later on, we won’t need to compile a dataset for our project since we already have one.

It seems like this is the way our project will continue, diverging from the prescribed path in places, sometimes requiring more work, sometimes giving us more breathing room. From the technical side, this is already something I’m thinking about as I plan my work. Because we’re still figuring out the exact form of our output, any digital tools I start setting up at this stage would need to be flexible enough to support a range of final products, until we pare things down and decide the details of our project more firmly. Which means right now my best bet is preliminary research for the basic structure and hosting of the site. Thankfully, I already had a lead on this. I’ve been wanting to look into Jekyll since I first learned about it in a discussion of minimal computing in my first class in this program. And this is finally my chance. A data vis project that doesn’t require a lot of social networking seems a perfect use case for creating a static webpage that can stand on its own. This is way easier to preserve in the long term, and also allows us to retain control over our project – we can keep our own copies of it so as to not be at the mercy of whether some faraway server shuts down.

Also, it’s called Jekyll, and we’re doing a project about horror, so that’s fun!

Getting familiar with some options for a basic framework for this project by looking into tools like Jekyll and GitHub Pages is a technical step I can take now. But beyond that, anything more specific will have to wait. Maybe as our ideas about the exact nature of our data visualization solidify, I can take a look at what data visualization tools exist for things like graph theory and network analysis, since that looks like the direction things may be headed, based on our discussion around seeing how themes in the games link to each other.
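
As a rough illustration of what “themes linking to each other” could mean in network terms, here is a minimal sketch that counts how often two themes show up in the same game. The game titles and theme tags below are invented placeholders, not entries from our dataset, and the function name is just something made up for this example. Each counted pair could later become a weighted edge in whatever graph tool we end up choosing.

```python
from collections import Counter
from itertools import combinations

# Placeholder data: titles and theme tags are invented for illustration,
# not taken from the actual dataset.
game_themes = {
    "Game A": {"monstrous-feminine", "survival", "maternity"},
    "Game B": {"survival", "maternity"},
    "Game C": {"monstrous-feminine", "survival"},
}


def theme_cooccurrence(games):
    """Count how many games each pair of themes appears in together."""
    edges = Counter()
    for themes in games.values():
        # Sort first so a pair is always stored in the same order.
        for pair in combinations(sorted(themes), 2):
            edges[pair] += 1
    return edges


edges = theme_cooccurrence(game_themes)
for (theme_a, theme_b), weight in sorted(edges.items()):
    print(f"{theme_a} -- {theme_b}: {weight}")
```

The result is a plain edge list with weights, which is about as tool-agnostic as it gets, so it should not lock us into any particular visualization library this early.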

So hey, we aren’t doomed or cursed or anything like that, we just… entered the haunted house through the back door, I guess! Doesn’t mean we can’t still explore it just as well as if we’d come the conventional way. The flexibility of being able to break the rules to some extent and not be punished for it is something I really appreciate about DH.

Of course, when doing something official like filling out grant proposals… well, if you break the rules there, you really are doomed. Good luck! 🙂

The AI “Hallucinations” Project (Revised Proposal)

Team Members and Roles

Sasha Richardson: Project Manager, Technical/Dev Lead. Responsible for “momentum-making,” code architecture, repository management (GitHub), and technical documentation.

Christian Gilkes: Research Lead. Responsible for sourcing content/data, bibliography, and ensuring intellectual rigor.

Michelle Santiago Cortés: Outreach & Documentation Lead, Design/UX. Responsible for public-facing copy, visual identity, accessibility standards, and meeting notes.

All Members: Data Curators (responsible for querying LLMs, collecting responses, and identifying hallucinations).

Abstract

The AI Hallucinations Project is a digital archive and critical analysis tool documenting how Large Language Models (LLMs) fabricate or distort the histories of marginalized communities. While AI models increasingly function as informal historians, they frequently invent figures, misquote theorists, or erase narratives when processing complex humanities data. Existing technical benchmarks treat these “hallucinations” as bugs to be patched; this project reframes them as cultural artifacts that reveal algorithmic bias. Building on the Black Knowledge Erasure Dataset (BKED), which documents distortions in 19th- and 20th-century Black history, this project expands its scope to include a parallel dataset on Puerto Rican histories from the same time period for comparative analysis. By employing controlled prompting across model families (GPT-5, Gemini, Claude) and verifying outputs against gold-standard archives, such as the Schomburg Center and the Library of Congress, the project will enable visualization of how epistemic erasure functions differently across specific diasporas. The final product will be a public-facing website featuring an explorable database, data reports, and visualizations, serving as an Open Educational Resource (OER) for educators and students.

Narrative

Enhancing the Humanities

The AI “Hallucinations” Project represents a crucial interdisciplinary initiative designed to enhance the humanities by fundamentally reinterpreting algorithmic errors. Rather than treating fabrication and misrepresentation by large language models (LLMs) as mere technical glitches to be debugged, the project elevates these errors into significant cultural artifacts worthy of critical scholarly analysis. Our core methodology involves systematically documenting how contemporary LLMs fabricate, distort, or outright erase narratives, figures, and historical events within the fields of Black American Studies and Puerto Rican Studies. This documentation creates a robust, empirical foundation that empowers scholars, educators, and librarians to actively challenge the perceived authority and neutrality of AI systems.

The project will produce and maintain an Open Educational Resource (OER) specifically designed to equip students and educators with the skills to navigate the complexities of AI-generated content. This initiative fosters a necessary level of media and technological literacy, encouraging critical engagement rather than passive acceptance of AI outputs. For researchers focused on algorithmic accountability and ethics, the project provides an invaluable, categorized dataset. This allows Critical AI Researchers to conduct granular audits of model behavior, moving beyond surface-level metrics to understand the deeply patterned cultural consequences of AI bias in downstream applications. Likewise, curators, archivists, and museum professionals can utilize the project’s database to proactively protect the integrity of digital archives and collective memory. The data serves as an early warning system against the silent but pervasive threat of digital misattributions, invented sources, or the algorithmic re-shaping of historical fact. Furthermore, the structured nature of the data allows Investigative Journalists and Explorers to filter AI “lies” by specific demographic categories, error types (e.g., fabrication, omission, misattribution), and topical focus. This capability enables targeted reporting on the real-world, quantifiable harms of algorithmic bias, transforming abstract concepts of AI ethics into concrete, actionable stories of injustice and systemic failure in public-facing technologies.

Environmental Scan

The rapid integration of Generative AI into educational settings has necessitated rigorous scrutiny of model reliability, and the controversies surrounding these technologies have prompted several recent studies of their consequences. One 2025 study highlighted a paradox associated with Large Language Models (LLMs): these technologies can produce false content of any kind (video, document, audio, text, or another digital file) that nonetheless gives the impression of being legitimate, yet they can also serve as tools for detecting and eliminating falsehoods if programmed appropriately, which illustrates the dilemma around reliability (Park and Nan, 2025). Another 2025 study takes a more supportive perspective, claiming that AI tools sustain research across many disciplines, while its authors acknowledge that real-world biases are easily reproduced in the data sources that AI software requires to function (Madanchian and Taherdoost, 2025). Researchers at the Stanford Institute for Human-Centered Artificial Intelligence investigated the impacts of ubiquitous autonomous learning models on African American communities. According to their report, AI software has proven useful in the medical field (especially for low-income patients), presents economic opportunities, and is a tool for advancing educational instruction. These prospects, however, can be hampered by algorithmic bias and misinformation, especially when AI technologies reflect discrimination embedded in their data sources (Djanegara, Elam, Kosoglu, Koyejo, Meinhardt, Nwankwo, Watkins, Wald, Zaman, and Zhang, 2024). These findings highlight the source of many controversies surrounding the implementation of AI systems in general classrooms and higher education. Accuracy and trustworthiness are critical points in this debate, especially as more people turn to AI to learn about a wide range of subjects, including historical topics. For these reasons, this project aims to leverage these observations and findings when uncovering evidence of informational errors produced by Generative AI software.

Questions about the reliability of Generative AI in education continue to be explored, revealing plenty of errors and inconsistencies. The AI Incident Database, for example, is a website that documents broad safety harms. It essentially functions as a digital library containing news articles and other reported evidence of real consequences resulting from AI technologies across various industries and spaces, including education. Additionally, benchmarks such as TruthfulQA and HaluEval are designed to measure how models mimic human falsehoods in a general sense. Also worth mentioning is Gordon McKelvie, a professor at the University of Winchester, who experimented with Microsoft Copilot to observe how the tool would analyze data curated for studying King Edward VI. It mainly gave facts pertaining to the source used, without making connections to broader English history (McKelvie, 2025). The research and projects described above were influenced by the widespread reality of “hallucinations” in AI-generated content. Proceeding with a methodology of documentation permits us to consider the various ways that current AI software can be misleading. Furthermore, the referenced studies offer tools and perspectives that can help us examine the flaws of Generative AI in its treatment of Black American and Puerto Rican histories.

History of the project

This project builds upon BKED, a dataset designed to document how AI models like Claude, GPT-5, and Gemini distort 19th- and 20th-century African American history and culture through specific “hallucinations.” Rather than viewing these errors as random bugs, the project frames them as “epistemic erasure,” where algorithms invent authorities or omit key figures in ways that mirror historical discrimination. The dataset includes the original prompts, the incorrect AI responses, and human-verified annotations that identify exactly where the models failed against standard archival sources. This dataset will serve as a component of the ongoing AI “Hallucinations” Project, an initiative aimed at cataloguing the ‘creative writing’ tendencies of algorithms to reveal how they impact marginalized communities, namely African Americans and Puerto Ricans. By treating these errors as cultural artifacts, the project allows technologists to analyze the ‘cultural consequences’ of AI fabrication.

Final Product and Dissemination

This phase of the AI Hallucinations Project will culminate in:

  1. The creation of a new dataset of documented AI hallucinations about Puerto Rican history of the 19th and 20th centuries.
  2. Short (500-800 words) critical essays about the anthropomorphising effect of the term “hallucination,” generative AI’s parasitic relationship to knowledge production ecosystems, and the methodologies used to develop the datasets.
  3. A primary visualization chart where users can scan through a catalogue of hallucinations, their prompts, and the tools that yielded them.
    1. plus additional visualization assets that summarize key findings
  4. A repository for both datasets (GitHub) to be made available for query and download.
  5. A website that will host all of the above.

Dissemination strategy will begin during the development phase with peer and community-based outreach: one-on-one, in-network conversations within our immediate communities of art and culture workers, historians, archivists, students, digital humanists, and tech workers. Given the subject matter of the histories guiding this project, Black American and Puerto Rican communities will also be at the center of our dissemination efforts, with special attention to the communities around the Schomburg Center for Research in Black Culture and El Centro for Puerto Rican Studies at Hunter College.

We know from being active participants in these communities that there is rising pressure to adopt tools like ChatGPT, Gemini, and Claude. Workers across industries are ill-equipped to assess the merits and harms of turning to such tools for reliable information, or to consider how their adoption affects the greater ecosystem of knowledge production. The AI Hallucinations Project aims to document AI fabrications and record them as cultural artefacts. The website will serve as a site for critical thinking about generative AI and its impact on knowledge production.

Because the project will primarily live on the website, we are aware that once the momentum for the initial launch and outreach campaigns slows down, the project will live out its afterlife through the Wayback Machine, people’s browser bookmarks, Are.na boards, public spreadsheets, and other link repositories organized by amateurs, hobbyists, and institutions alike. We do not think this is an undesirable outcome; in fact, this is what successful outreach and dissemination look like. For this stage of the AI Hallucinations Project, successful dissemination is when the link to the website earns its place among people’s personal collections of internet artefacts or among institutions’ lists of recommended resources. We will tease the project throughout its development by adding relevant readings, influences, inspirations, and ideas to an Are.na board that will be shared publicly closer to launch.

In this spirit, the second phase of the dissemination and outreach strategy will consist of identifying 10-15 link repositories where the link to our project might be found by the intended audiences. Link repositories have boomed in popularity as a way to circumvent the narrowing effects of recommendation algorithms and commercial search engines. They live as websites listing links to other websites, public spreadsheets, Are.na boards, and paywalled recommendations. Individual users maintain their own collection of links on their browser or dedicated notes software, with some creators building entire communities and business models by treating their link repositories (usually containing product or travel recommendations) as commodities.

As we prepare to launch the website, we will assess the need for an Instagram or Substack presence, depending on the findings from the earlier stages of outreach. At this time, we believe Instagram is the most accessible platform for all our target user communities, and see a potential for growth by continuing our community building there. We are also considering a newsletter format, which is more immediate and yields higher engagement with the added benefit of producing an email list that can be adapted for other purposes in the future.

Technologies

To achieve our Minimum Viable Product (MVP) by May, we must focus our learning and efforts on a core set of necessary skills. This includes basic proficiency in Python scripts for data collection, manipulation, and outputting JSON/CSV files. We also require a functional understanding of the GitHub workflow, covering essential commands for cloning, committing, pushing, branching, and managing pull requests to ensure proper version control and issue tracking. Finally, a crucial component is understanding how to structure and validate the collected data using JSON/CSV schemas to maintain consistency, along with fundamental web development knowledge (HTML, CSS, and potentially a simple JavaScript framework or static site generator) to display the data accessibly and meet universal design standards.
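
To make the data collection and validation skills above more concrete, the following is a minimal sketch, under stated assumptions, of how a single collected response might be checked and written out as JSON and CSV. The field names (`prompt`, `model`, `response`, `cultural_context`, `error_type`) and the function names are hypothetical placeholders; the project's actual schema has not been finalized and may look quite different. The sample record contains placeholder text only.

```python
import csv
import io
import json

# Hypothetical field names; the project's real schema may differ.
REQUIRED_FIELDS = ("prompt", "model", "response", "cultural_context", "error_type")


def validate_record(record):
    """Return a list of problems found in one record (empty list means valid)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems


def to_json(records):
    """Serialize records as readable JSON, keeping accented characters intact."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records):
    """Serialize records as CSV text with one column per required field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=REQUIRED_FIELDS,
        extrasaction="ignore",  # silently drop fields outside the schema
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Placeholder record for illustration only; not real project data.
records = [
    {
        "prompt": "placeholder prompt",
        "model": "placeholder-model",
        "response": "placeholder response",
        "cultural_context": "Puerto Rican",
        "error_type": "fabrication",
    }
]

valid = [r for r in records if not validate_record(r)]
print(to_json(valid))
print(to_csv(valid))
```

Keeping validation separate from output means the same check could run before either export, which may help all three data curators produce consistent entries.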

There are a few stretch goals that, while ambitious, would be beneficial but are not critical for the MVP. On the web development front, this involves implementing a more dynamic front-end framework like React or Vue, or gaining deeper knowledge of a back-end language for database integration, which is likely beyond the scope of a rapid MVP. We also consider advanced universal design and accessibility testing, moving beyond the basics to implement sophisticated features and conduct rigorous testing, as a beneficial but non-essential stretch goal.

To ensure we deliver the MVP by the May deadline, we may strategically scale back in a few key areas, particularly in Web Development. Instead of building a highly dynamic, database-driven website, we will focus on a static or minimally interactive site that solely displays the collected, structured data accessibly. This means scaling back the scope to exclude complex search or filtering mechanisms beyond what a simple static site can handle, and eliminating the need for a dedicated back-end server beyond static file hosting.
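
As one possible illustration of the scaled-back, static approach, the sketch below turns structured records into a plain HTML table that any static file host could serve, with no back-end server involved. The function name, column names, and caption text are assumptions made for this example, and the sample row is placeholder text rather than project data.

```python
from html import escape


def render_table(records, columns):
    """Render records as a static HTML table with a caption and header cells."""
    head = "".join(f'<th scope="col">{escape(col)}</th>' for col in columns)
    rows = []
    for record in records:
        # Escape every value so model output cannot inject markup into the page.
        cells = "".join(
            f"<td>{escape(str(record.get(col, '')))}</td>" for col in columns
        )
        rows.append(f"<tr>{cells}</tr>")
    return (
        "<table>\n"
        "<caption>Documented hallucinations</caption>\n"
        f"<thead><tr>{head}</tr></thead>\n"
        "<tbody>\n" + "\n".join(rows) + "\n</tbody>\n"
        "</table>"
    )


# Placeholder row for illustration only.
sample_rows = [{"prompt": "<b>placeholder</b> prompt", "model": "placeholder-model"}]
page_fragment = render_table(sample_rows, ["prompt", "model"])
print(page_fragment)
```

A caption and `scope` attributes on header cells are small, low-cost steps toward the accessibility basics mentioned above; anything beyond that would fall under the stretch goals.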

Project Management

Tasks will be tracked via Google Docs and Google Sheets, while code will be managed on GitHub. Our primary communication channels are text and email, with a commitment to check messages every 48 hours. Weekly sync meetings will take place on Fridays at 1:00 PM.

Work Plan: Milestones and Deliverables

Fri., Feb. 27th (Project Work Plans)
Deliverable: Submission of detailed timeline
Specific Goal: Finalize the list of reputable sources to be used to develop queries for the Latino history dataset.

Fri., Mar. 6th (Data Management Plans)
Deliverable: Submission of the schema.json and data dictionary
Specific Goal: Ensure the metadata schema accommodates both datasets (e.g., ensuring “cultural context” fields can distinguish between Black and Latino entries).

Fri., Mar. 13th (Outreach/Social Media Plan)
Deliverable: Submission of the strategy for publicizing the project
Specific Goal: Draft copy that explains the nature of the project to a general audience.

Fri., Mar. 20th (Project Website Draft)
Deliverable: Launch of the basic landing page
Specific Goal: A functional site structure that includes the “About,” “Methods,” and a preliminary search interface for the datasets.

Fri., Mar. 27th (Pre-Break Update)
Deliverable: Status check on data collection
Specific Goal: Completion of 50% of the raw model querying for the new Latino dataset.

Fri., Apr. 17th (Post-Break Update)
Deliverable: Adjustments based on initial findings
Specific Goal: Completion of human verification/fact-checking for the collected model responses.

Fri., Apr. 24th (Final Stretch)
Deliverable: Finalizing the Data Visualizations
Specific Goal: Implementing the comparative charts showing hallucination rates between the two demographic datasets.

Fri., May 1st (Final Project Update)
Deliverable: Final submission of the website, Explorable Database, and White Paper
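
The comparative charts planned for the final stretch depend on a per-dataset hallucination rate. A minimal sketch of that calculation follows, assuming each verified response is stored with a `dataset` label and an `is_hallucination` flag; both field names are hypothetical, and the sample rows are invented placeholders, not findings. Note that a rate can only be computed if responses judged accurate are recorded alongside the hallucinated ones.

```python
from collections import defaultdict


def hallucination_rates(records):
    """Return {dataset: share of verified responses flagged as hallucinations}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for record in records:
        totals[record["dataset"]] += 1
        if record["is_hallucination"]:
            flagged[record["dataset"]] += 1
    return {name: flagged[name] / totals[name] for name in totals}


# Invented placeholder rows; these numbers are not project results.
sample = [
    {"dataset": "BKED", "model": "model-x", "is_hallucination": True},
    {"dataset": "BKED", "model": "model-x", "is_hallucination": False},
    {"dataset": "Puerto Rican", "model": "model-x", "is_hallucination": True},
    {"dataset": "Puerto Rican", "model": "model-x", "is_hallucination": False},
    {"dataset": "Puerto Rican", "model": "model-x", "is_hallucination": False},
    {"dataset": "Puerto Rican", "model": "model-x", "is_hallucination": False},
]

rates = hallucination_rates(sample)
for dataset, rate in rates.items():
    print(f"{dataset}: {rate:.0%} of verified responses flagged")
```

The same grouping could be repeated by model or by error type if the schema ends up carrying those fields.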

Pretty Terrifying Project (working name)

Abstract
The horror video game genre, shaped by a male-dominated industry, has historically centered masculine perspectives in both creation and representation. Women and the LGBTQ+ community are underrepresented both in production roles, as developers and designers, and in game content, where characters are often portrayed through harmful tropes such as sexualization and female monstrosity. While horror has been examined in film and literature studies, horror video games are underexplored as cultural artifacts. This project builds on an earlier phase of work on a constructed dataset, horror_games_feminist_themes, for which keywords were web-scraped from Wikipedia’s Category: Horror video games tree to identify possible recurring feminist themes. The project now aims to refine the dataset and transform it into a public-facing website that visualizes and interprets the patterns that emerge from it, making them visible and analyzable.

List of Participants
Naila – Project Manager
Lead project and meetings; organize and keep track of tasks and calendar; assist with other roles (visual design, development, etc); outreach on a social media platform

Michael – Visual Design
Identifying visual layout for data visualizations being created; assisting in the layout for UI of site; double-checking visual accessibility (wording, color contrast etc.)

Truly – Developer
Coding; documentation for the project; setting up website; assisting with research.

Enhance the Humanities
In horror studies, scholars such as Barbara Creed argue that the genre encodes themes of sexuality, reproduction, and maternity by framing women as monstrous. Expanding further allows us to extend those ideas into tropes of fear and survival. While these scholarly frameworks provide crucial foundational research on femininity in horror stories and media, video games tend to be underexamined within the horror genre. Critical analysis of these feminist themes can provide meaningful engagement with how women are portrayed in these types of media.

This project plans to extend how feminist horror theory can be considered through interactive media. It treats horror video games not merely as mindless entertainment, but as a way to examine how industries induce fear through gameplay mechanics, female embodiment, and player engagement. Unlike traditional film and books, where engagement is passively experienced, video games require participants to interact with their worlds and to embody a sense of vulnerability, survival, and curated, limited autonomy. Fear is not only seen or heard but, in a small capacity, lived. By building an interactive dataset that highlights these themes in horror video games, we gain new insights into how these narratives are presented in this underexamined medium.

Environmental Scan
Horror studies have historically been centered on film and literature; scholars such as Barbara Creed have analyzed the “monstrous-feminine” as a figure that is shaped by patriarchal fears about embodiment and reproduction. More recent scholarship extends these conversations into video games, interpreting monstrous female figures not simply as misogynistic constructs but as a form of resistance against the trope. Works such as Redefining the Monstrous-Feminine: Applying a Postfeminist (Eco)Gothic Reading to Horror Video Games by Jennifer Loring offer frameworks for interpreting witches, ghosts, and vampires as figures aligned with nature and with rebellion against patriarchal structures. Similar analyses of games like Doki Doki Literature Club! examine how female antagonists disrupt player agency and destabilize typical male-driven themes and narratives in the horror genre (Graham 2025). While there is qualitative scholarship that critiques gendered tropes in horror, there are currently no datasets or data visualization projects on the subject.

Final Product and Dissemination
Now let’s delve into what deployment will look like. We believe that the process of creating a data visualization is itself scholarship – that in organizing content into a new form, we may reveal some novel insight into that content, or just take a new perspective by seeing the information through a different lens. For this reason, the final output of the project does not have to be a website that is solely available online. What makes this a digital humanities process is how digital tools serve to help us push and question our thinking throughout the process. So while one of the final outputs of this project will be a data visualisation hosted on a public website, another will be a lightweight static copy of the website that can be stored on drives or personal computers, and distributed that way. While this version of the project may lose some of the ease and interactivity of a fully online version, it will also be easier to preserve, as it will not be at the mercy of changing technological standards, and it can be viewed by people without stable access to the internet. This way, a record of the scholarship remains, even if the online version itself is quickly left behind by technological development in a way that makes maintaining it untenable. The project is thus both more accessible in the present and accessible to future generations.

As for the online version, it could be hosted on a relatively simple website such as CUNY Academic Commons, WordPress, or Blogspot. Regardless of where it’s hosted, we would post it along with documentation and our justification for the project. If possible, it would be a good idea to post the static version of the site and a PDF version of the documentation for download there as well, to make the project accessible in multiple forms.

Works Cited
D’Ignazio, Catherine, and Lauren F. Klein. Data Feminism. MIT Press, 2020.

Drucker, Johanna. “Humanities Approaches to Graphical Display.” Digital Humanities Quarterly, vol. 5, no. 1, 2011, https://dhq.digitalhumanities.org/vol/5/1/000091/000091.html

Graham, Hannah. Metalepsis and Mental Castration: Doki Doki Literature Club! as the Cerebral Monstrous-Feminine. Georgia Southern University, Master’s thesis, 2023, https://digitalcommons.georgiasouthern.edu/etd/2976/

Loring, Jennifer. Redefining the Monstrous-Feminine: Applying a Postfeminist EcoGothic Reading to Horror Video Games. 2024. ResearchGate, https://www.researchgate.net/publication/393901879_Redefining_the_Monstrous-Feminine_Applying_a_Postfeminist_EcoGothic_Reading_to_Horror_Video_Games

The Voices of Lunfardo (Revised)

Abstract

The Voices of Lunfardo project seeks to create an interactive dictionary of fifteen lunfardo terms that are widely used in Argentina today. Lunfardo is the slang of the Río de la Plata region (Buenos Aires and Montevideo), originating in the late nineteenth and early twentieth centuries and shaped primarily by the daily experiences of working-class Italian immigrants. Many of its terms became popular through tango lyrics, and, over time, these words moved beyond tango and entered common speech, where they continue to evolve in meaning and usage.

Designed for college-level Spanish students, general Spanish speakers, and lovers of languages, the project situates lunfardo, often mischaracterized as criminal slang, as a language originating in the life and experiences of Italian immigrants. Each dictionary entry will present the lunfardo word, its standard Spanish equivalent, an English translation, tango lyrics featuring the term with links to audio and video recordings, and an explanation of its cultural significance in the context of tango and working-class identity. Examples include “morfar” (“to eat”), “mufa” (“bad luck”), a word widely used during the 2022 FIFA World Cup, and “falluto” (“fake”, “dishonest”). In addition, each entry will provide a link to current media sources, such as podcasts, news articles, or blogs, that demonstrate contemporary use of the term, accompanied by reflections on its historical evolution and cultural implications.

This project seeks to explore the cultural and historical significance of lunfardo through two central questions. First, it asks how the use of lunfardo in tango lyrics functions as a critical archive of immigrant working-class identity in Buenos Aires, capturing the voices and experiences of marginalized communities. Second, the project investigates how each term is used today and the role it plays in contemporary Argentine culture, exploring the ways in which these words have persisted or evolved while continuing to carry traces of their historical and cultural origins.

The Need

Online lunfardo dictionaries and glossaries already exist. These resources are lexical repositories that list lunfardo terms and brief definitions, offering valuable reference tools for researchers and general audiences. However, they are not designed primarily for Spanish learners, and they present vocabulary in isolation, without history, cultural context, or explanations of contemporary usage. In contrast, the dictionary proposed in this project is explicitly aimed at Spanish students and emphasizes contextualized and interactive learning. Each entry will incorporate tango lyrics and multimedia links to songs, explanations of cultural and historical significance, examples of current usage drawn from contemporary media, and interactive activities that encourage learners to practice lunfardo terms in meaningful present-day contexts. By integrating language learning, cultural history, and digital pedagogy, this project moves beyond a static glossary and presents lunfardo as a living and evolving component of Rioplatense Spanish. This project addresses these gaps by using Omeka, which allows for the creation of exhibits that function as a digital dictionary, while also supporting multimedia resources and interactive, pedagogically driven activities for student engagement.

Impact and Intended Results

Each term will be represented as an individual page on the Omeka platform, allowing for the organization of rich, multimedia content in a structured format. On each item page, students will find the lunfardo term alongside its standard Spanish equivalent and English translation. The page will also include tango lyrics featuring the term, with links to audio and video recordings, as well as explanations of its cultural significance in the context of tango and working-class identity. In addition, each entry will provide a link to current media sources, such as podcasts, news articles, or blogs, that demonstrate contemporary use of the term, accompanied by reflections on its historical evolution and cultural implications. By using Omeka in this way, the project will combine linguistic, historical, and cultural content in a single, navigable digital space, making it easier for students to explore the terms in both their historical and contemporary contexts. The platform also allows for future expansion and the integration of interactive activities, encouraging active student engagement with the material.
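
As a planning aid, the content of one item page can be sketched as a simple record. This is a minimal sketch only: the class name, field names, and the `missing_fields` helper are assumptions made for illustration, not Omeka's own metadata schema or API, and "comer" is supplied here as the standard Spanish equivalent of "morfar".

```python
from dataclasses import dataclass, field


@dataclass
class LunfardoEntry:
    """One dictionary entry. Field names are illustrative, not Omeka's schema."""
    term: str
    standard_spanish: str
    english: str
    cultural_significance: str = ""
    tango_lyrics: list[str] = field(default_factory=list)
    recording_links: list[str] = field(default_factory=list)
    contemporary_media_links: list[str] = field(default_factory=list)

    def missing_fields(self):
        """Return the names of fields that are still empty for this entry."""
        return [name for name, value in vars(self).items() if not value]


# A draft entry with only the three core fields filled in so far.
entry = LunfardoEntry(term="morfar", standard_spanish="comer", english="to eat")
print(entry.missing_fields())
```

A checklist like this could help track which of the fifteen entries still need lyrics, recordings, or contemporary media links before they are entered on the platform.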

This project will make lunfardo more accessible to a wide audience, from students and scholars to Spanish learners and language lovers. Presented at university conferences and workshops and used in Spanish courses, it will support education and research. Additionally, sharing the dictionary on social media and Spanish-learning platforms will allow people around the world to explore lunfardo in a fun and interactive way.

The Plan

Phase 1: Research and Data Collection (February 2026). We will categorize terms by cultural significance, identify well-known tangos that use each term, and gather modern uses of lunfardo from interviews, podcasts, and YouTube sources. We will also prepare the audio and video media files.

Phase 2: Omeka Platform Development (March 2026). We will install Omeka and configure plugins. We will upload collections and design interactive pages according to the following categories: Definition, Cultural Reflection, Video/Audio, Fun Facts.

Phase 3: Narrative Integration and Public Engagement (April 2026). We will write historical and cultural narratives linking expressions to social events and urban life. We will embed multimedia content: podcast clips, YouTube videos, images, and audio recordings. We will conduct user testing with collaborators and target audiences. We will refine the exhibit and prepare the final Omeka site for public launch.

Project Resources: Personnel and Management

List of experience and responsibilities of each staff member.

  • Natalia Bustos: research oversight and narrative.
  • Aaron Helton: software evaluation, installation and configuration, deployment, and additional narrative.

Advisory Panel:

  • Oscar Conde – lexicography consultant.
  • Universidad Católica Argentina Spanish faculty – language validation.
  • Tango scholars – historical and cultural expertise.
  • Digital platform consultants

Final Product and Dissemination

The final product will be a hosted website serving fifteen exhibits on an Omeka installation. It will remain web accessible in its published format. Additionally, it will include a basic toolset in a GitHub repository that will facilitate long-term maintenance, such as software updates and the capability of migrating the site elsewhere as necessary. The data itself will reside in the Omeka installation and will undergo regular backup to guard against disaster and maintain continuity. Finally, because Omeka is a full content management system, we will be able to sign in regularly to enter new terms, update or edit existing terms, or make other changes as necessary.

In addition to posting regular social media updates on Bluesky, which has attracted numerous digital humanities and other academic practitioners, we are submitting a proposal to the upcoming ACH conference, whose themes this year include transnational challenges and how the digital humanities can meet them. We also plan to reach out to the American Association of Teachers of Spanish and Portuguese to participate in some of their events and activities.

 

Participation and Promotion

I’ll be honest: I often look at the tech world’s obsession with “move-fast, break-things” work styles with suspicion. Why the fetish for being reckless and ignorant? I always wondered why one can’t move quickly and thoughtfully. I like knowing what I am going to do before I do it. And I usually trick myself into diving into projects by planning out every last detail until I’m suddenly in the thick of it.

But I welcome the changes this project brings, a trial-by-fire style of simultaneous thinking and doing. One Week, One Tool offered some reassuring insight: You’re not supposed to know what you’re doing. Everyone had their own little bubbles to think about what they were doing, to meditate and check in with themselves. I have a feeling that as long as I am doing something, it will be moving the needle somehow.

We met for some time last Friday. That felt like time spent thinking together, in addition to the time we spent discussing the project in class, and we are familiar enough with each other from sharing a classroom (or in my case with Chris, two classrooms) last semester. I think we have more warming up to do. It’s so hard to jump into conceptualizing and ideating a project cold that I can see how One Week, One Tool’s do-then-regroup approach is more efficient at melding a group.

I am more excited to learn from this project than I am to contribute to it: I write tech criticism for a living, which includes AI-hype bubble-bursting. (My first contribution to the project: open up the discussion on the merits and implications of using the word “hallucination.”) But, shamefully, I have no hands-on experience working with chatbots and LLMs. So I look forward to getting busy doing my share of data curation, to taking in, absorbing, and learning instead of leading. Even in my role as documentation and outreach lead, I want to subordinate it to the role of data curator. I think that is the ideal setup for any outreach or documentation lead: to be knee-deep in the work you are endeavouring to promote, or even “brand.” The promotion should extend naturally from the work being promoted, and I have a lot of faith in and enthusiasm for this project.

Prospects for Developing Unique Research Approaches

I took the past two weeks of our team project as an opportunity to ponder how my research background can enhance the study of AI hallucinations. As a matter of fact, Sasha’s initial project proposal caught my attention because of genuine concerns about increased reliance on AI-generated material for knowledge. There are real-life examples, both reported and not, of individuals being misled by AI-generated videos, images, text, or other content, affecting business and mass awareness. Hence the controversies around the presence of generative AI tools in education. Especially in the study of history, there are individuals (including myself) who question the accuracy of some generated answers. Studying AI hallucinations of ethnic histories positions everyone involved to reinforce or dismantle some current assumptions about AI in education, especially in the teaching of past human experiences.

 

Getting assigned to a research-leading role made me reconsider how any history is studied. Granted, the AI Hallucinations project is not designed for studying a historical topic in detail. However, a journal article by Ryan Cordell (assigned last semester) asserted that knowledge and insights from disciplines outside of Digital Humanities are necessary (Cordell, 2016). Being involved with the project means leveraging what I have learned about researching history, a process that ends with basic facts (of a person, event, or era) along with perspectives on what the topic means in relation to another topic or a broader context. AI has been proven to function as a research tool; to what extent, however, can it fully accomplish the mission of producing a narrative? While the project’s aims do not extend to answering this question, its potential findings could be examined from such an angle.

 

These reflections are already guiding my research. Finding other projects with similar scope, data, and intentions was not easy this week, since Artificial Intelligence is still relatively recent and academic perspectives on its usage are still emerging. Nonetheless, I decided to seek out scholarship that discusses the impact of AI on education, society, and especially the overall study of human histories. The initial searches led to sources that emphasize proven errors in AI systems, providing detailed background on the issues being addressed. The authors found during preliminary research share unique and opposing perspectives, thus demonstrating the relevant controversies. In addition, different types of errors are described, e.g., informational errors, source-reading errors, and bias. With the environment surrounding AI considered, studying its hallucinations creates the conditions necessary for recognizing flaws in a technology that is spreading globally.

What the Machine Forgot

Sasha’s Project Journal (2/18/2026)

Our team carved out some time outside of class to sit down and actually talk through what we’re doing together. It sounds simple, but there’s something really valuable about getting everyone in the same room (or virtual room) before a project picks up steam, just to make sure we’re all working from the same map. The core idea behind our project is something I find genuinely fascinating: instead of treating AI hallucinations as bugs or glitches, we’re reframing them as data worth studying. Specifically, we’re looking at how large language models distort or outright erase the histories of marginalized communities, and what that tells us about the biases baked into these systems. It’s a shift in mindset from “the AI made a mistake” to “the AI revealed something.” That distinction feels important.

I am particularly excited to dive into the technical side of this by developing an interactive user interface to explore our data. Since the project will eventually be a public-facing web archive, I want the searchable database to feel intuitive while allowing users to filter “lies” by demographic and error type. Creating data visualizations that illustrate how AI hallucinations vary across different cultural contexts will be a challenge, but a necessary one to move from a “glitch” mindset to “critical data”.
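
A minimal sketch of that filtering logic, assuming each hallucination is stored as a record with a demographic and an error type. The field names, the placeholder communities, and the three error-type labels are all assumptions for illustration; none of these rows come from the actual dataset.

```python
# Hypothetical records; field names and values are placeholders, not real data.
records = [
    {"prompt_id": 1, "demographic": "Community A", "error_type": "invented citation"},
    {"prompt_id": 2, "demographic": "Community B", "error_type": "omission"},
    {"prompt_id": 3, "demographic": "Community B", "error_type": "invented figure"},
]


def filter_records(rows, demographic=None, error_type=None):
    """Return rows matching every filter that is set; None means 'any value'."""
    return [
        row for row in rows
        if (demographic is None or row["demographic"] == demographic)
        and (error_type is None or row["error_type"] == error_type)
    ]


# Example: everything recorded for one community, then one error type within it.
community_b = filter_records(records, demographic="Community B")
omissions_b = filter_records(records, demographic="Community B", error_type="omission")
print(len(community_b), len(omissions_b))
```

Treating an unset filter as "any value" keeps the interface forgiving: a visitor can start with the whole archive and narrow it one choice at a time.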

Also, getting outside perspectives on the project this week helped surface some of its more subtle aspects. We’re essentially treating AI hallucinations as cultural artifacts, which opens up a lot of questions we’re still working through. When an AI invents a citation, is that meaningfully different from when it invents a historical figure wholesale? What about erasure by omission, when someone’s history simply doesn’t appear, or appears only in distorted fragments? Are these all versions of the same problem, or do they need to be categorized differently? I don’t think we have clean answers yet.

Because I’m the person most familiar with the project at this point, especially the technical side and the prior dataset work, there’s an added responsibility on me to keep the structure clear and the vision consistent. That doesn’t mean it’s “my” project, but it does mean I need to help translate ideas into concrete next steps and make sure everyone’s aligned. As the project manager, I want to be clear about timelines and decisions without being overbearing. I want to keep us moving while still making space for everyone’s ideas to shape the project. I can already see how striking that balance may be more challenging than it sounds, but it feels like an important skill to build.

Right now, the main focus is keeping our workflow steady as we get further into the project. The first few weeks of any collaboration usually run on excitement alone, but that energy fades. I want us to build habits that will still work when deadlines stack up or things get complicated. For us, that means regular check-ins, being honest about what’s working and what isn’t, and paying attention to how the workload is distributed so no one quietly takes on too much. We’re still early in the process, but I feel confident about the team we’re shaping and genuinely interested in where the research will take us.

A Tangled Web of Tangled Webs

You are in a dark haunted mansion, the kind that seems like it may be full of endless branching hallways. You don’t know yet if it is, though; you’re only in its lobby. Lightning illuminates the room as a storm rages outside. You can see torn wallpaper barely covering patches of mold. The floor is decaying in places, so you’ll have to watch where you step. You have the distinct feeling that somewhere in this strange place, something strange is lurking, and that you should really try and find it… before it finds you.

This could be the setup to a horror game, but isn’t it also kind of like the beginning of an academic project? There’s the dark twisting mansion of a dataset to comb through, and there are pitfalls like expanding your scope too much, running into technical difficulties, or being unable to display your data well. And there’s the fact that really, you don’t know what you’ll find. If you start your project knowing exactly where it will end, then you aren’t really learning anything from what you’re doing; you’re just cherry-picking facts to support your own conclusions. This may be digital humanities, but the scientific method is a good model for looking at a project here too: you may start with a question to focus your research, and a hypothesis to test, but you can’t know what you’ll conclude at the end until you do your experiment, or in this case, your research and data visualization.

All that to say: these first few weeks of the project weren’t about diving into the horrifying halls of our dataset, not yet. We each took our own small peeks into a few rooms, but for the most part we stayed in the big menacing lobby to… hold a strategy meeting. We talked about our ideas, agreed on a general approach for moving forward, and then got on our computers (because miraculously, there’s wifi in this metaphorical storm-drenched haunted house) to get some wisdom from our predecessors. After all, we’re not the first people to look at feminism, horror, video games, or even all three combined. But as our goal is to create something new and add to an under-analyzed area of media, we have to see what’s out there so we can build on it instead of just doing what’s been done already or, worse, struggling around in the dark with no way to even start. I know that soon we’re going to have to hit the ground running, but I like that we have a moment to take stock of the dataset we mean to explore as well as do some research, not necessarily to include in the project directly, but to give ourselves a direction for our research and technological approach. And looking at various sources about the frameworks others have used around horror and feminism has given me a promising place to start hypothesizing. For example, reading an article about gothic literature got me thinking about what keywords from the dataset would most exemplify that subgenre: probably “domesticity” and “captivity”, with “violence” and “trauma and mental illness” potentially present as well. In addition, looking at Barbara Creed’s The Monstrous-Feminine, which was one of Naila’s original inspirations for this, can also yield some archetypical combinations of keywords. For instance, the Archaic Mother concept could be expressed with the keywords “motherhood” and “woman as monster”, with “violence” and “sexualized violence” also maybe showing up.

From looking at these existing frameworks, then, we can hypothesize what combinations of themes may show up most commonly together in the dataset. And if we see something different than we expect, we can investigate why horror games might clump these themes together differently than what we see in other forms of media like novels and films. Finally, this might be outside the scope of this project, but looking at what themes aren’t very commonly linked together seems like a good way to get some creative energy flowing. What would it look like for a story to connect themes that aren’t usually linked together, and how might that play with and subvert expectations? I’m excited to explore this data further, to make a project around it, and to just see what kind of cool thoughts it can inspire.
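
One way to test a hypothesis like this is to count how often each pair of keywords is tagged on the same game. The sketch below is only an illustration under stated assumptions: the game titles and tag assignments are made-up placeholders (only the keyword names come from the discussion above), and the real dataset may be structured differently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each game mapped to the theme keywords tagged on it.
# Titles and tag assignments are placeholders, not rows from the real dataset.
games = {
    "Game A": {"domesticity", "captivity", "violence"},
    "Game B": {"motherhood", "woman as monster", "violence"},
    "Game C": {"domesticity", "captivity", "trauma and mental illness"},
}


def keyword_pair_counts(tagged_games):
    """Count how many games contain each unordered pair of keywords."""
    counts = Counter()
    for keywords in tagged_games.values():
        # Sorting first means each pair is always stored in the same order.
        for pair in combinations(sorted(keywords), 2):
            counts[pair] += 1
    return counts


pair_counts = keyword_pair_counts(games)
# Show the pairs that appear together most often.
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

Pairs that never appear in the counts are just as interesting here, since they point to the themes that are rarely linked together.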

So how do you get through a haunted house without succumbing to The Horrors? Use the knowledge you can get access to as a jumping-off point. Take the old map, scratched out on a rotting board by the previous survivors of this scary endeavor, and find new forbidden secrets where it leads.

You’ll need every bit of help you can get to make it to the end.