As we enter the final weeks of the semester, the energy surrounding the project is very much heads-down, get-it-done. A key highlight this week was a productive consultation with Lisa Rhody and Eunah Cho from the GCDI. The meeting helped the group refine our project framing, sharpen our core arguments, and identify potential future trajectories for this research beyond the current semester.
One of the bigger takeaways was solidifying how we frame our guiding question. Our research goes beyond merely identifying AI errors; it asks the more productive question of how AI systems can perpetuate bias and cause harm and erasure even when they meet technical compliance standards. This framing gives our final analysis a more distinct critical perspective and substantially sharpens the overarching argument of the project.
The meeting also reinforced something we’ve been working toward from the beginning: making sure our critique is actionable. Documenting patterns of hallucination is valuable, but the project becomes genuinely useful when it points toward what those patterns reveal and what someone could actually do with that information. That’s the standard we’re holding ourselves to as we finalize everything.
The consultation also opened up some genuinely compelling threads for future research. One in particular was the idea of tracking confidence metrics in the language of the model outputs, specifically what it might reveal when a model delivers a hallucination with a high degree of apparent certainty. That confidence reading could potentially serve as a signal pointing back to gaps or skews in the training data itself, which is a thread worth pulling on in a future iteration of this research. We cannot pursue it fully before the semester ends, but it is now on the map for what comes next.
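To make the confidence idea a little more concrete, here is a minimal sketch of what a first pass might look like, assuming we measure apparent certainty purely from word choice. This is not something we have built or tested. The marker lists and the function names `count_markers` and `apparent_certainty` are hypothetical, chosen only for illustration, and a real study would need a validated lexicon and a much more careful definition of "confidence."

```python
import re

# Hypothetical marker lists, for illustration only. A real study would need
# a validated lexicon rather than a handful of hand-picked words.
CERTAINTY_MARKERS = ("definitely", "certainly", "undoubtedly", "clearly")
HEDGING_MARKERS = ("possibly", "perhaps", "may", "might", "reportedly")


def count_markers(text, markers):
    """Count whole-word, case-insensitive occurrences of each marker in text."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(marker) + r"\b", lowered))
        for marker in markers
    )


def apparent_certainty(text):
    """Score wording from -1.0 (only hedges) to 1.0 (only certainty markers).

    Returns 0.0 when the text contains none of the listed markers.
    """
    certain = count_markers(text, CERTAINTY_MARKERS)
    hedged = count_markers(text, HEDGING_MARKERS)
    total = certain + hedged
    if total == 0:
        return 0.0
    return (certain - hedged) / total
```

Under these assumptions, a score like this could be paired with our fact-check labels, so that outputs marked as hallucinations and scoring near 1.0 are flagged for a closer look. A word count like this only captures surface phrasing, so it would be a starting point for the question rather than an answer to it.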
With these conversations informing our direction, the team has divided up the remaining work for the upcoming week. Chris is leading the fact-checking push on the Puerto Rican history dataset. The whole point of building a verified, annotated dataset is that the verification actually holds up, so Chris is taking the time to do it carefully even under time pressure.
Michelle is handling the web hosting logistics, including working through the domain transfer issue we ran into after purchasing through GoDaddy. What we did not anticipate was a mandatory waiting period before a newly registered domain can be transferred to another host, which has added an unexpected wrinkle to getting the site live.
On the presentation and website side, the focus right now is polish. The structure is there, the content is taking shape, and what’s left is making sure everything we’ve built communicates clearly and holds together as a cohesive whole. The goal is a final product that speaks for itself, one that doesn’t require a lot of explanation to understand why it matters.
Tying everything together, we also have an upcoming meeting with Luke Waltzer, Director of the Teaching and Learning Center at the Graduate Center. The conversation will focus on the project’s potential as an educational resource and on how our critique of AI compliance and harm can function practically in an academic setting.
It’s been a long semester, and there’s still a lot to do in a short amount of time. But the team is focused, the roles are clear, and the finish line is actually in sight.