Category Archives: Group Project Updates

The AI “Hallucination” Project Work Plan

Project Scope Statement

The AI “Hallucination” Project is a semester-long digital humanities research initiative investigating how large language models distort or erase the histories of marginalized communities. The project will produce two parallel annotated datasets, one focused on 19th- to 20th-century Black history (BKED) and another on 19th- to 20th-century Puerto Rican history. Both will be generated by running structured query prompts across the GPT-5, Gemini, and Claude models and subjecting the outputs to systematic human fact-checking and annotation. The final deliverables are a public-facing explorable website cataloguing hallucination instances by type and community, and a white paper documenting methodology, data collection procedures, and findings. The project is scoped for one semester, with all deliverables completed by May 1, 2026.

Scoping and Scheduling the Work

Translating the project’s conceptual goals into a concrete, time-bound work plan has been one of the more clarifying exercises of the semester so far. The Gantt chart we developed maps the full arc of the project across four broad phases: pre-production and outreach, dataset development, analysis and development, and final production. Each phase carries its own dependencies and risks, and understanding how they connect is essential to keeping the project on track.

The pre-production phase, running from late February through mid-March, focuses on establishing the structural and logistical foundations the rest of the project depends on. This includes finalizing the Project Work Plan and Data Management Plan, discussing outreach and distribution strategy, and beginning early community-facing work such as scheduling a consultation with a GCDI fellow and preparing promotional materials for a NYC Open Data Week presentation. In parallel, development of the project website skeleton (comprising the About, Methods, and placeholder Search Interface pages) begins on March 10th, provided the Designer/UX role has supplied sufficiently finalized design direction beforehand.

The dataset development phase constitutes the methodological core of the project. Beginning March 17th, the team will collaboratively draft approximately forty-five (45) query prompts drawn from verified Puerto Rican historical sources. Prompt writing is followed by a split execution strategy: the first fifty percent of raw model queries will be run between March 24th and 27th, at which point the Research Lead will review and approve the outputs before the full batch proceeds. This checkpoint is an intentional quality control measure, designed to catch errors in prompt design or model behavior before they propagate across the entire dataset. The second half of the query execution is scheduled for April 10th through 14th, immediately following spring break, allowing the team to return to the work with a fresh perspective and sufficient time for the intensive annotation and verification work that follows.
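The split execution strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the placeholder prompt IDs, and the choice to split by prompt (so every prompt is run on all three models within the same batch) are ours for the sake of the example, not a finalized part of the plan.

```python
# A minimal sketch of the two-batch query plan with an approval checkpoint.
# All names below are hypothetical; the plan does not specify an implementation.

MODELS = ["GPT-5", "Gemini", "Claude"]  # models named in the scope statement


def plan_batches(prompts, models=MODELS):
    """Split the prompts in half, then pair each prompt with every model."""
    midpoint = len(prompts) // 2
    first = [(p, m) for p in prompts[:midpoint] for m in models]
    second = [(p, m) for p in prompts[midpoint:] for m in models]
    return first, second


def release_second_batch(second_batch, approved):
    """Hand back the second batch only once the first batch has been approved."""
    if not approved:
        raise RuntimeError("First batch not approved yet; second batch is on hold.")
    return second_batch


# Placeholder IDs standing in for the roughly 45 prompts the team will draft.
prompts = [f"prompt-{i:02d}" for i in range(1, 46)]
first, second = plan_batches(prompts)
print(len(first), "queries before the checkpoint,", len(second), "after")
```

Splitting by prompt rather than by individual query keeps each prompt's three model outputs in the same review round, which should make side-by-side comparison easier at the checkpoint.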

The post-spring break period from mid-to-late April is the most logistically demanding phase of the project, with several parallel workstreams converging in a short window. Between April 13th and 17th, with advice from the Research Lead, the team will build a fact-checking and annotation workflow, compile and clean the verified dataset, and flag hallucination instances with accompanying notes. These tasks are largely sequential and directly gate the development work that follows: comparative hallucination rate charts cannot be built, and data visualizations cannot be integrated into the website, until both datasets are clean and verified. The Designer/UX role will additionally advise on chart style and color palette during this period, ensuring visual consistency across the archive. During this time, public-facing distribution and outreach strategies will launch. These will include strategic dissemination of the website URL and accompanying platform marketing efforts (an Instagram page, Substack newsletter, or Are.na channel), depending on what is decided while developing the distribution strategy.
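To make the dependency concrete, here is a rough sketch of how comparative hallucination rates might be computed once the annotated dataset is clean. The record fields (`model`, `community`, `hallucination`) and the sample rows are hypothetical placeholders for illustration only; they are not project data or findings.

```python
from collections import defaultdict


def hallucination_rates(rows):
    """Return the share of flagged outputs for each (model, community) pair."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for row in rows:
        key = (row["model"], row["community"])
        totals[key] += 1
        if row["hallucination"]:
            flagged[key] += 1
    return {key: flagged[key] / totals[key] for key in totals}


# Made-up placeholder rows with neutral model labels, NOT real annotations.
sample_rows = [
    {"model": "model-a", "community": "Black history", "hallucination": True},
    {"model": "model-a", "community": "Black history", "hallucination": False},
    {"model": "model-b", "community": "Black history", "hallucination": False},
    {"model": "model-b", "community": "Puerto Rican history", "hallucination": True},
]

for (model, community), rate in hallucination_rates(sample_rows).items():
    print(f"{model} / {community}: {rate:.0%}")
```

A table like this is the input the comparative charts would need, which is why the charts cannot be built until annotation and verification are finished.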

The final production phase runs from April 25th through May 1st and involves simultaneous completion of the project’s three major deliverables: testing and deploying the final website, populating and finalizing the explorable database on the live site, and writing the technical sections of the white paper (including methodology, data collection, and findings).

A Pretty Terrifying Work Plan!

Here’s our plan for this project! Generally, the journey of this project involves starting by further augmenting and analyzing our dataset, then creating the website and the visualizations that will live there, then iterating on and improving our work, while also engaging in outreach and developing presentations and supplementary materials. The week-by-week breakdown is as follows.

Week 1:

2/26 – 3/5

  • Divide the dataset into 3 sections and review which data points may be missing and require additional evaluation. (Truly: rows 2-32, Michael: rows 33-63, Naila: rows 64-94)
    • Review how Barbara Creed’s frameworks apply to the video games in the dataset
    • What is the monstrous-feminine theory, and how does it apply to the horror video game genre?
  • Team to add notes and comments directly to the Google Sheet version of the dataset, and to any other notes in our team document.
  • Look for patterns and connections, and think about ideas for visualizations.

Week 2:

3/5 – 3/12

Truly

  • Deploy the website via GitHub

Michael & Naila

  • Tie up loose ends in the dataset review and continue thinking about data visualizations/potential platforms.
  • Find other video game analyses on the internet. What does the game studies ecosystem look like?

 

Week 3: 

3/12 – 3/19

Michael

  • Website wireframe

Naila

  • Decide on outreach platforms and begin outreach – show what the project will look like

Truly

  • Work on drafting data visualizations

Week 4:

3/19 – 3/26

Truly

  • Code website skeleton
  • Begin working on website documentation

Naila

  • Revise data visualizations
  • Additional research as needed

Michael

  • Revise data visualizations
  • Additional research as needed

 

Week 5 & 6 (spring break)

3/26 – 4/16

Break time and/or time to catch up on tasks. TBD.

 

Week 7:

4/16 – 4/23

  • Finalize the name
  • Prepare slides and presentation
  • Website content drafting
  • UI and other visual assets

Project presentation day (Tuesday, 04/21)

 

Week 8:

4/23 – 4/30

  • Test and improve website accessibility.

Truly:

  • User test and debug the website, and add any additional web content
  • Ensure both the deployed site and the static copy of the site

Michael & Naila

  • Revise front-end (written) content

 

Week 9:

4/30 – 5/7

  • Prepare for the public project presentation
  • Work on slides and hold a team rehearsal
  • Dress Rehearsal (5/7/2026)

 

Week 10 & 11

5/7 – 5/21

  • Revision on all fronts.
  • Final outreach push leading up to the project launch
  • More TBD.
  • Public project launch at GC Digital Showcase (5/21)

Work Plan for “The Voices of Lunfardo”

This project will proceed in three phases. Participants will meet via Zoom every Wednesday. Phase 1 will last three weeks (March 5-26). In this phase, the team members will finalize the system selection and installation, complete the collection of terms, and develop the narratives corresponding to each term. In Phase 2 (April 2-16), the members will create term pages and prepare brief biographical profiles. Finally, in Phase 3 (April 23-May 7), the participants will start preliminary outreach, revise and customize the site, and conduct a full project rehearsal. On May 14, the team will present the project at the GC Digital Showcase.

Phase 1 (March 5-26)

During the first phase, the team will meet on Wednesdays in one-hour Zoom meetings. By March 5, the group will finalize the system selection and complete the compilation of terms. By March 12, Aaron will procure system hosting and prepare a preliminary interface design, while Natalia will write the narratives for seven terms. By March 19, Natalia will finish the term narratives, while Aaron will install and configure the system. At the weekly Zoom meetings, the team will discuss progress on the interface and the narratives. Aaron will provide feedback on the narratives; Natalia will provide feedback on the interface. By March 26, Aaron will review the system configuration, while Natalia will revise and edit the term narratives. They will both begin initial outreach.

Phase 2 (April 2-16)

During this phase, the team will enter term pages and write their biographical narratives. They will continue holding weekly Zoom meetings during which participants will review details and offer feedback on project progress. By April 2, five term pages will be completed and entered into the system. By April 9, the group will enter five additional terms. Finally, by April 16, the team will complete the entry of all terms as well as the biographical narratives.

Phase 3 (April 23-May 7)

During the last phase, the group will continue meeting via Zoom. The participants will focus on outreach activities. The week of April 23, the team will continue outreach for the provisional site. Aaron will revise the system, while Natalia will revise the narratives. By April 30, the team will start outreach for the final site, while Aaron will focus on customization, with particular attention to the site’s visual design and user interface. Finally, by May 7, the group will conduct a full project rehearsal. On May 14, the project will be presented at the GC Digital Showcase.