Creating lists of forbidden keywords for chats or comments with AI
In online communities, maintaining a positive and respectful environment is crucial, especially in spaces like Twitch chat, YouTube comments, or social media threads. A keyword filter is a great way to help achieve this, and with the assistance of AI it becomes easier to set up a customized list that captures a wide variety of inappropriate language, from offensive slurs to spam phrases. If you ask ChatGPT for such a list, however, it will not give you a clear answer; instead, it censors itself (see the attached screenshot).

Another important aspect to consider is that some words, like “gay”, can have multiple uses. While such terms may appear in an insulting context, they are also essential for normal, positive communication: if someone wishes to come out as gay, an overly restrictive AI filter might block that message in error.

Multilingual contexts present a further challenge. AI systems trained primarily on English data can misinterpret words in other languages as English curse words. A simple example is the Luxembourgish sentence “Dat ass richteg flott!”, which means “This is really nice!” in English. Here, an AI could mistakenly censor the Luxembourgish word “ass” (which means “is”) as the English term “ass”.

This brings us to two important questions: do the benefits of an automated filter outweigh these limitations, and what countermeasures can we implement to prevent an overly restrictive filter? The sketches below illustrate a few possible countermeasures.
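A useful baseline does not need AI at all: a plain word-boundary match over a curated blocklist already avoids the classic substring mistake of flagging “assistance” because it happens to contain “ass”. The following is a minimal Python sketch; the blocklist entries are hypothetical placeholders for whatever list you curate for your own community.

```python
import re

# Hypothetical placeholder blocklist; replace with your community's curated words.
BLOCKED_WORDS = ["badword1", "badword2", "spamphrase"]

# One compiled pattern with \b word boundaries, so blocked words only match
# as whole words, never as substrings of harmless words.
BLOCKED_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in BLOCKED_WORDS) + r")\b",
    re.IGNORECASE,
)

def is_blocked(message: str) -> bool:
    """Return True if the message contains any blocked word as a whole word."""
    return BLOCKED_PATTERN.search(message) is not None

print(is_blocked("This stream is great"))     # False
print(is_blocked("badword1 in the chat"))     # True
print(is_blocked("thanks for the badword1s")) # False: not a whole-word match
```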
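For ambiguous terms like “gay”, one possible countermeasure is to split the list into tiers: unambiguous slurs are removed automatically, while context-dependent words are only escalated to a human moderator, so a coming-out message is never auto-deleted. The word sets below are hypothetical placeholders.

```python
# Hypothetical placeholder lists; replace with your community's curated words.
HARD_BLOCK = {"slur1", "slur2"}   # unambiguous: remove automatically
NEEDS_REVIEW = {"gay"}            # context-dependent: escalate to a human moderator

def moderate(message: str) -> str:
    """Return 'block', 'review', or 'allow' for a chat message."""
    words = {word.strip(".,!?\"'").lower() for word in message.split()}
    if words & HARD_BLOCK:
        return "block"
    if words & NEEDS_REVIEW:
        return "review"
    return "allow"

print(moderate("I want to come out as gay"))  # review: a human decides, nothing is auto-deleted
print(moderate("what a nice stream"))         # allow
```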
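For the multilingual problem, one possible gate is to detect the language of a message first and only apply the English blocklist to messages that actually look English. This sketch assumes the third-party langdetect package (pip install langdetect). Note that langdetect ships no Luxembourgish model, so “Dat ass richteg flott!” is typically detected as a related language such as German or Dutch, which is still enough to keep the English list from firing. Detection on very short chat messages is unreliable, so treat this as a heuristic, not a guarantee.

```python
# pip install langdetect  (third-party package; an assumption, not part of any chat platform API)
import re

from langdetect import DetectorFactory, detect
from langdetect.lang_detect_exception import LangDetectException

DetectorFactory.seed = 0  # detection is probabilistic; fix the seed for repeatable results

# Hypothetical English-only blocklist; "ass" stands in for any English curse word.
ENGLISH_BLOCKLIST = re.compile(r"\b(ass)\b", re.IGNORECASE)

def passes_filter(message: str) -> bool:
    """Apply the English blocklist only to messages detected as English."""
    try:
        language = detect(message)
    except LangDetectException:  # e.g. empty or symbol-only messages
        language = "unknown"
    if language == "en":
        return ENGLISH_BLOCKLIST.search(message) is None
    return True  # other languages need their own lists, not the English one

print(passes_filter("Dat ass richteg flott!"))  # expected True: not detected as English
print(passes_filter("You are such an ass"))     # expected False: English whole-word match
```

A practical refinement is to maintain a separate blocklist per detected language rather than letting non-English messages pass unchecked; the gate above only prevents the English list from being applied to the wrong language.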