AI image generators get a new safety test for hidden toxic text in memes

Generative AI models can be prompted with just a few words to embed offensive or discriminatory text in the images they produce. Aditya Kumar of the SPRINT-ML Lab at the CISPA Helmholtz Center for Information Security is investigating how such outputs can be reliably prevented. To this end, he developed ToxicBench, a benchmark dataset that evaluates how well image-generating AI systems handle offensive inputs, along with a fine-tuning strategy for adapting the models accordingly.
