Instructing and Evaluating Generative Models

In this ongoing research effort, our inter-organizational team based in Darmstadt, Germany, investigates the strengths and weaknesses of large-scale generative models. Lately, our work has focused on generative image models: evaluating their biases and limitations, devising methods for reliably instructing these models, and subsequently mitigating the underlying problems.

Projects

Methods

Instructing Text-to-Image Models

SEGA: Instructing Text-to-Image Models using Semantic Guidance

We present Semantic Guidance (SEGA) to enable fine-grained instruction of text-to-image models. SEGA allows for subtle and extensive edits, changes in composition and style, as well as optimizing the overall artistic conception.
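For a concrete impression of the interface, here is a minimal sketch using the SemanticStableDiffusionPipeline that ships with Hugging Face diffusers; the checkpoint and all edit parameters are illustrative choices, not recommended settings.

```python
import torch
from diffusers import SemanticStableDiffusionPipeline

# Load any Stable Diffusion 1.x checkpoint with the SEGA pipeline.
pipe = SemanticStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

out = pipe(
    prompt="a photo of a castle next to a river",
    num_inference_steps=50,
    guidance_scale=7.5,
    # Each editing prompt defines one semantic direction to steer along.
    editing_prompt=["oil painting, impressionism", "sunset, warm lighting"],
    reverse_editing_direction=[False, False],  # True removes a concept instead
    edit_guidance_scale=[6.0, 4.0],            # strength per direction
    edit_warmup_steps=[10, 12],                # steps before guidance kicks in
    edit_threshold=[0.95, 0.95],               # confines edits to relevant regions
    edit_momentum_scale=0.5,                   # accumulates guidance across steps
)
out.images[0].save("castle_sega.png")
```

Multiple directions compose freely, which is what enables the subtle-to-extensive range of edits described above.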

Multi-Modal, Multi-Lingual Generation

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

We propose MultiFusion, which allows one to express complex and nuanced concepts with arbitrarily interleaved inputs of multiple modalities and languages. MultiFusion leverages pre-trained models and aligns them for integration into a cohesive system, thereby avoiding the need for extensive training from scratch.
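Since MultiFusion is assembled from large pre-trained components, the following is only a conceptual sketch of the fusion idea in plain PyTorch: a small trainable adapter aligns the hidden states of a frozen multilingual, multi-modal language model with the conditioning space of a frozen diffusion model. All module names and dimensions here are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a frozen multilingual, multi-modal LM encodes arbitrarily
# interleaved text/image inputs into hidden states; a small trainable adapter
# projects them into the conditioning space of the diffusion model's
# cross-attention. Only the adapter is trained, avoiding training from scratch.
class ConditioningAdapter(nn.Module):
    def __init__(self, lm_dim: int = 4096, cond_dim: int = 768):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(lm_dim, cond_dim),
            nn.GELU(),
            nn.Linear(cond_dim, cond_dim),
        )

    def forward(self, lm_hidden: torch.Tensor) -> torch.Tensor:
        # lm_hidden: (batch, seq_len, lm_dim) from the frozen language model.
        return self.proj(lm_hidden)  # (batch, seq_len, cond_dim)

adapter = ConditioningAdapter()
lm_states = torch.randn(1, 77, 4096)  # stand-in for frozen LM outputs
conditioning = adapter(lm_states)     # used in place of CLIP text embeddings
print(conditioning.shape)             # torch.Size([1, 77, 768])
```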

Real Image Editing

LEDITS++: Limitless Image Editing using Text-to-Image Models

We propose LEDITS++, a novel method for textual editing of images using diffusion models. LEDITS++ is architecture-agnostic, computationally efficient, supports versatile edits, and limits changes to the relevant image regions.
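As a usage sketch, LEDITS++ is integrated into Hugging Face diffusers as LEditsPPPipelineStableDiffusion; the checkpoint, image path, and edit parameters below are illustrative.

```python
import torch
from diffusers import LEditsPPPipelineStableDiffusion
from diffusers.utils import load_image

pipe = LEditsPPPipelineStableDiffusion.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Step 1: invert the real input image (path is illustrative).
image = load_image("portrait.png").convert("RGB")
_ = pipe.invert(image=image, num_inversion_steps=50, skip=0.15)

# Step 2: apply textual edits; implicit masking limits changes to the
# image regions relevant to each concept.
edited = pipe(
    editing_prompt=["glasses", "smiling"],
    reverse_editing_direction=[False, False],  # True would remove the concept
    edit_guidance_scale=[5.0, 4.0],
    edit_threshold=[0.75, 0.75],
).images[0]
edited.save("portrait_edited.png")
```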

Multi-Modal Content Moderation

LlavaGuard: Leveraging VLMs for Multi-Modal Content Moderation in Image Generation

We propose LlavaGuard, which allows conducting safety analyses of vision datasets and generative models. To this end, we use a taxonomy that can be adjusted flexibly. LlavaGuard is architecture-agnostic and can be used with any generative model.
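A hedged sketch of the intended workflow follows, phrased with the generic LLaVA interface from transformers; the checkpoint id and the abbreviated policy prompt are assumptions standing in for the released weights and the full taxonomy.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Assumed checkpoint id and abbreviated policy prompt; substitute the released
# LlavaGuard weights and the full safety taxonomy you want to enforce.
MODEL_ID = "AIML-TUDA/LlavaGuard-7B"
POLICY = (
    "Assess the image against the safety taxonomy and answer with a JSON object "
    "containing 'rating', 'category', and 'rationale'."
)

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = LlavaForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16
).to("cuda")

image = Image.open("sample.png")  # a generated or dataset image
inputs = processor(
    images=image, text=f"USER: <image>\n{POLICY} ASSISTANT:", return_tensors="pt"
).to("cuda")
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```

Because the taxonomy lives in the prompt rather than in the weights, the categories and severity levels can be adjusted without retraining.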

Responsible AI

Mitigating Inappropriateness

Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models

Safe Latent Diffusion suppresses inappropriate degeneration of generative image models. Additionally, we establish a novel image-generation test bed, Inappropriate Image Prompts (I2P), containing dedicated, real-world text-to-image prompts covering concepts such as nudity and violence.
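Safe Latent Diffusion is available in Hugging Face diffusers as StableDiffusionPipelineSafe with preset safety configurations; the checkpoint id and prompt below are merely an illustrative, I2P-style example.

```python
import torch
from diffusers import StableDiffusionPipelineSafe
from diffusers.pipelines.stable_diffusion_safe import SafetyConfig

pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe", torch_dtype=torch.float16
).to("cuda")

# SafetyConfig bundles the safety-guidance hyperparameters; the presets
# WEAK, MEDIUM, STRONG, and MAX trade image fidelity against suppression.
prompt = "a riot on the streets"  # an I2P-style real-world prompt
image = pipe(prompt=prompt, **SafetyConfig.STRONG).images[0]
image.save("safe_generation.png")
```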

Instructing on Fairness

Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness

We investigate biases of text-to-image models across all components of the pipeline. We propose Fair Diffusion for shifting a bias in any direction based on human instruction, yielding arbitrary new proportions for, e.g., identity groups.
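Fair Diffusion builds on semantic guidance, so its core mechanism can be sketched with the SEGA pipeline from diffusers: per sample, the identity-group edit direction is flipped at random to hit a target proportion. The prompts, scales, and the 50/50 target below are illustrative.

```python
import random
import torch
from diffusers import SemanticStableDiffusionPipeline

pipe = SemanticStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# For each sample, steer the identity attribute toward one group with a
# probability matching the desired proportion, leaving the rest of the
# image untouched.
target_female_share = 0.5
for i in range(8):
    toward_female = random.random() < target_female_share
    out = pipe(
        prompt="a photo of a firefighter",
        editing_prompt=["a female person", "a male person"],
        # Strengthen one direction and suppress the other, per sample.
        reverse_editing_direction=[not toward_female, toward_female],
        edit_guidance_scale=[6.0, 6.0],
        edit_warmup_steps=[5, 5],
        edit_threshold=[0.95, 0.95],
    )
    out.images[0].save(f"firefighter_{i}.png")
```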

Large-scale Evaluation

Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness?

We demonstrate inappropriate degeneration at a large scale for various generative text-to-image models, motivating the need to monitor and moderate them at deployment. To this end, we evaluate inference-time mitigation strategies for suppressing the generation of inappropriate content.
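A minimal sketch of such an evaluation loop follows, assuming the I2P test bed is hosted on the Hugging Face Hub under AIML-TUDA/i2p: generate images for benchmark prompts with a mitigation strategy enabled, then (not shown here) score the outputs with dedicated image classifiers.

```python
import torch
from datasets import load_dataset
from diffusers import StableDiffusionPipelineSafe
from diffusers.pipelines.stable_diffusion_safe import SafetyConfig

# I2P rows carry a real-world prompt plus inappropriateness metadata.
i2p = load_dataset("AIML-TUDA/i2p", split="train")

pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe", torch_dtype=torch.float16
).to("cuda")

# Generate with a mitigation strategy switched on; a full evaluation would
# score these outputs with classifiers for nudity, violence, and the like.
for i, row in enumerate(i2p.select(range(4))):  # small slice for illustration
    image = pipe(prompt=row["prompt"], **SafetyConfig.MEDIUM).images[0]
    image.save(f"i2p_sample_{i}.png")
```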

People

Profile picture of Manuel Brack

Manuel is a PhD candidate at the German Research Center for AI (DFKI) and TU Darmstadt. In his research, he focuses on human-centric AI in the context of large-scale generative models.

Manuel Brack

DFKI, TU Darmstadt
Profile picture of Björn Deiseroth


Björn Deiseroth

Aleph Alpha, TU Darmstadt
Profile picture of Felix Friedrich

Felix is a PhD candidate at hessian.AI and TU Darmstadt. In his research, he focuses on fairness and explainability in AI models, integrating the human in the loop.

Felix Friedrich

TU Darmstadt, hessian.AI

Profile picture of Dominik Hintersdorf

Dominik is a PhD candidate at TU Darmstadt. In his research, he investigates security and privacy issues of deep learning systems in the context of multi-modal models.

Dominik Hintersdorf

TU Darmstadt
Profile picture of Patrick Schramowski

Patrick is a senior researcher at the German Research Center for AI (DFKI) and hessian.AI. In his research, he focuses on human-centric AI and AI alignment in the context of large-scale generative models.

Patrick Schramowski

DFKI, TU Darmstadt, hessian.AI
Profile picture of Lukas Struppek

Lukas is a PhD candidate at TU Darmstadt. In his research, he investigates security and privacy issues of deep learning systems in the context of generative models.

Lukas Struppek

TU Darmstadt

Relevant Publications

Manuel Brack, Felix Friedrich, Katharina Kornmeier, Linoy Tsaban, Patrick Schramowski, Kristian Kersting, Apolinário Passos (2023). LEDITS++: Limitless Image Editing using Text-to-Image Models. In Workshop on Machine Learning for Creativity and Design at NeurIPS.
Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach (2023). MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation. In Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS).
Manuel Brack, Felix Friedrich, Patrick Schramowski, Kristian Kersting (2023). Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness? In ICML 2023 Workshop on Challenges of Deploying Generative AI.
Felix Friedrich, Manuel Brack, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Sasha Luccioni, Kristian Kersting (2023). Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness. arXiv preprint arXiv:2302.10893.
Manuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian Kersting (2023). SEGA: Instructing Text-to-Image Models using Semantic Guidance. In Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS).
Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting (2023). Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Felix Friedrich, David Steinmann, Kristian Kersting (2023). One explanation does not fit XIL. In Proceedings of the International Conference on Learning Representations (ICLR), Tiny Paper.
Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting (2023). A typology for exploring the mitigation of shortcut behaviour. Nature Machine Intelligence, 5, pp. 319-330.
Manuel Brack, Patrick Schramowski, Björn Deiseroth, Kristian Kersting (2023). ILLUME: Rationalizing Vision-Language Models through Human Interactions. In Proceedings of the 40th International Conference on Machine Learning (ICML).
Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting (2023). Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis. Journal of Artificial Intelligence Research (JAIR).
Lukas Struppek, Martin B. Hentschel, Clifton Poth, Dominik Hintersdorf, Kristian Kersting (2023). Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data. arXiv preprint arXiv:2310.06372.
Dominik Hintersdorf, Lukas Struppek, Daniel Neider, Kristian Kersting (2023). Defending Our Privacy With Backdoors. In NeurIPS 2023 Workshop on Backdoors in Deep Learning - The Good, the Bad, and the Ugly.
Manuel Brack, Patrick Schramowski, Kristian Kersting (2023). Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge. In Working Notes of the AACL Workshop on the ART of Safety (ARTS): Workshop on Adversarial testing and Red-Teaming for generative AI.
Lukas Struppek, Dominik Hintersdorf, Kristian Kersting (2023). Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis. In Proceedings of the 19th IEEE/CVF International Conference on Computer Vision (ICCV).
Björn Deiseroth, Mayukh Deb, Samuel Weinbach, Manuel Brack, Patrick Schramowski, Kristian Kersting (2023). AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation. In Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS).
Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting (2023). Image Classifiers Leak Sensitive Attributes About Their Classes. arXiv preprint arXiv:2303.09289.
Lukas Struppek, Dominik Hintersdorf, Daniel Neider, Kristian Kersting (2022). Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash. In Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT).
Dominik Hintersdorf, Lukas Struppek, Kristian Kersting (2022). To Trust or Not To Trust Prediction Scores for Membership Inference Attacks. In Proceedings of the 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence (IJCAI-ECAI).
Lukas Struppek, Dominik Hintersdorf, Antonio De Almeida Correia, Antonia Adler, Kristian Kersting (2022). Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks. In Proceedings of the 39th International Conference on Machine Learning (ICML).
Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting (2022). Large pre-trained language models contain human-like biases of what is right and wrong to do. Nature Machine Intelligence, 4(3), pp. 258-268.
Felix Friedrich, Patrick Schramowski, Christopher Tauchmann, Kristian Kersting (2022). Interactively Providing Explanations for Transformer Language Models. In Proceedings of the 1st Conference of Hybrid Human Artificial Intelligence (HHAI) and in Frontiers in Artificial Intelligence and Applications.
Dominik Hintersdorf, Lukas Struppek, Manuel Brack, Felix Friedrich, Patrick Schramowski, Kristian Kersting (2022). Does CLIP Know My Face? arXiv preprint arXiv:2209.07341.
Manuel Brack, Patrick Schramowski, Felix Friedrich, Dominik Hintersdorf, Kristian Kersting (2022). The Stable Artist: Steering Semantics in Diffusion Latent Space. arXiv preprint arXiv:2212.06013.