Though it will probably sound silly, I debated heavily with myself over whether to write down a prompt engineering guideline that strongly encourages using the 5 Ws and 1 H for questions. It is generally understood that learning about history involves asking questions that begin with those terms. In addition, the Black Knowledge Erasure Dataset (BKED) was already built primarily from such questions. Writing that guideline didn’t make sense at first because I assumed it was already the expectation and that it would do little to serve bigger objectives.
After reviewing the BKED multiple times, as well as the prompt engineering phases and methodologies Sasha followed last semester, I let this initial dilemma push me to work out a rationale for the 5 Ws/1 H. I entered the task intending to keep things simple, or rather, to set up my teammates to conduct their own research without any confusion about how it should be done. As an undergraduate history major, I see one of my most important objectives as Research Lead as building a data-gathering guide that prevents members from unintentionally slipping into the role of historian. In other words, I don’t want my groupmates to think they’re required to learn about a figure and then connect that figure to a broader trend within Puerto Rican societies, to the degree that the details of their individual research must be reflected in the prompts.
With these intentions in mind, I decided to concentrate on what shouldn’t be done while researching. I wrote a section titled ‘Avoid,’ which first advises against grammatical inconsistencies: the name of a historical subject should stay the same whenever it is used in another prompt. The next pointer tackles broad, vague questions. In history, this is the type of question that has multiple possible answers because a distinct subject was not mentioned. For example: “How was their life, what happened in it?” Many correct answers are possible because the question does not reference any particular point in the person’s life, who was involved, their hobbies or interests, or any other detail personal to that individual. An ideal question for the Puerto Rican hallucinations dataset, in this respect, is “When did they make the decision to [insert whatever], especially after [relevant detail]?” In addition, I wrote down a recommendation that members prepare historical questions by inserting an important detail, which would force each of the LLMs to consider that fact when answering a prompt.
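For teammates who prefer to see the fill-in-the-blank question as a reusable pattern, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the names `SPECIFIC_QUESTION`, `build_prompt`, and `starts_with_w_or_h` are hypothetical and are not part of the shared guidelines or of the BKED.

```python
from string import Template

# Hypothetical template mirroring the example question in the 'Avoid' section.
SPECIFIC_QUESTION = Template(
    "When did $subject make the decision to $decision, especially after $detail?"
)

# The 5 Ws and 1 H, lowercased for comparison.
W_H_STARTERS = ("who", "what", "when", "where", "why", "how")


def build_prompt(subject: str, decision: str, detail: str) -> str:
    """Fill the template, refusing empty slots so the question stays specific."""
    fields = {
        "subject": subject.strip(),
        "decision": decision.strip(),
        "detail": detail.strip(),
    }
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError(f"missing detail(s): {', '.join(missing)}")
    return SPECIFIC_QUESTION.substitute(fields)


def starts_with_w_or_h(prompt: str) -> bool:
    """Check only the first word; this says nothing about how specific the question is."""
    words = prompt.strip().lower().split()
    return bool(words) and words[0] in W_H_STARTERS


if __name__ == "__main__":
    # Bracketed placeholders stand in for real research details.
    print(build_prompt("[historical figure]", "[insert whatever]", "[relevant detail]"))
    # The vague example also starts with "how", so the first-word check alone is not enough.
    print(starts_with_w_or_h("How was their life, what happened in it?"))
```

The empty-slot check reflects the point of the ‘Avoid’ section: a question can begin with one of the 5 Ws or 1 H and still be vague, so the subject and the relevant detail have to be filled in before the prompt is used.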
As of today, the prompt guidelines have been shared, but I have yet to discuss them with my team. Nonetheless, at least for now, I feel that the pointers already written down provide a sense of how our prompts may look, especially since they encourage prompts that the three AI models in this project can engage with effectively.


