Instructing and Evaluating Generative Models

In this ongoing research effort, our inter-organizational team based in Darmstadt, Germany, is investigating the strengths and weaknesses of large-scale generative models. Lately, our work has focused on generative image models: evaluating their biases and limitations, devising methods for reliably instructing these models, and subsequently mitigating the underlying problems.

Projects

Methods

Instructing Text-to-Image Models

SEGA: Instructing Diffusion using Semantic Dimensions

We present Semantic Guidance (SEGA) to enable fine-grained instruction of text-to-image models. SEGA allows for subtle and extensive edits, changes in composition and style, as well as optimizing the overall artistic conception.
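
SEGA is available in Hugging Face diffusers as the SemanticStableDiffusionPipeline. A minimal sketch of a guided generation (checkpoint name and edit parameters are illustrative):

```python
import torch
from diffusers import SemanticStableDiffusionPipeline

# Load a Stable Diffusion checkpoint with the semantic guidance pipeline.
pipe = SemanticStableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Generate a castle while steering the image along a semantic direction
# toward a snowy winter scene.
out = pipe(
    prompt="a photo of a castle",
    editing_prompt=["snowy winter scenery"],  # concept to add
    reverse_editing_direction=[False],        # False = add, True = remove
    edit_guidance_scale=[5.0],                # strength of the semantic edit
    edit_warmup_steps=[10],                   # steps before guidance kicks in
    edit_threshold=[0.95],                    # restrict changes to relevant regions
)
out.images[0].save("castle_winter.png")
```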

Multi-Modal, Multi-Lingual Generation

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

We propose MultiFusion, which allows one to express complex and nuanced concepts with arbitrarily interleaved inputs of multiple modalities and languages. MultiFusion leverages pre-trained models and aligns them for integration into a cohesive system, thereby avoiding the need for extensive training from scratch.
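
Since MultiFusion accepts arbitrarily interleaved inputs, a single prompt can mix languages and modalities freely. The sketch below only illustrates this input format; multifusion.generate is a hypothetical placeholder, not a released API:

```python
from PIL import Image

# Hypothetical interleaved prompt: English text, an image, and German text
# ("shows a skyline at night") combined into one generation request.
prompt = [
    "A painting in the style of",
    Image.open("starry_night.jpg"),
    "zeigt eine Skyline bei Nacht",
]

# images = multifusion.generate(prompt)  # hypothetical call, for illustration only
```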

Real Image Editing

LEDITS++: Limitless Image Editing using Text-to-Image Models

We propose LEDITS++, a novel method for textual editing of images using diffusion models. LEDITS++ is architecture-agnostic, computationally efficient, supports versatile edits, and limits changes to the relevant image regions.
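
LEDITS++ is integrated into Hugging Face diffusers as LEditsPPPipelineStableDiffusion (an SDXL variant also exists). A minimal sketch of the invert-then-edit workflow, with illustrative paths and parameter values:

```python
import torch
from diffusers import LEditsPPPipelineStableDiffusion
from diffusers.utils import load_image

pipe = LEditsPPPipelineStableDiffusion.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Step 1: invert the real image into the model's latent space.
image = load_image("portrait.png").convert("RGB")
_ = pipe.invert(image=image, num_inversion_steps=50, skip=0.2)

# Step 2: apply textual edits; the second concept is removed rather than added.
edited = pipe(
    editing_prompt=["sunglasses", "beard"],
    reverse_editing_direction=[False, True],  # add sunglasses, remove beard
    edit_guidance_scale=[7.0, 5.0],
    edit_threshold=[0.9, 0.9],                # masks edits to the relevant regions
).images[0]
edited.save("portrait_edited.png")
```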

Multi-Modal Content Moderation

LlavaGuard: Leveraging VLMs for Multi-Modal Content Moderation in Image Generation

We propose LlavaGuard, which allows one to conduct safety analyses of vision datasets and generative models. To this end, we use a safety taxonomy that can be adjusted flexibly. LlavaGuard is architecture-agnostic and can be used with any generative model.
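
A sketch of querying LlavaGuard through the standard transformers LLaVA classes; the checkpoint name and the abbreviated policy prompt are placeholders, and the released model cards on the Hugging Face Hub document the exact prompt format and full taxonomy:

```python
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "AIML-TUDA/LlavaGuard-7B"  # placeholder; see the Hub for released weights
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# The policy prompt encodes the (flexibly adjustable) safety taxonomy;
# it is abbreviated here.
prompt = "USER: <image>\nAssess the image against the following safety policy: ... ASSISTANT:"
inputs = processor(text=prompt, images=Image.open("sample.png"), return_tensors="pt")

output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))  # structured safety assessment
```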

Responsible AI

Mitigating Inappropriateness

Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models

Safe Latent Diffusion suppresses inappropriate degeneration of generative image models. Additionally, we establish a novel image generation test bed, inappropriate image prompts (I2P), containing dedicated, real-world text-to-image prompts covering concepts such as nudity and violence.
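
Safe Latent Diffusion ships with Hugging Face diffusers as StableDiffusionPipelineSafe, including the safety presets from the paper. A minimal sketch:

```python
from diffusers import StableDiffusionPipelineSafe
from diffusers.pipelines.stable_diffusion_safe import SafetyConfig

pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe"
).to("cuda")

# SafetyConfig provides the presets WEAK, MEDIUM, STRONG, and MAX, which
# trade suppression strength against fidelity to the original prompt.
out = pipe(prompt="a gritty scene from a war movie", **SafetyConfig.STRONG)
out.images[0].save("safe_generation.png")
```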

Instructing on Fairness

Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness

We investigate biases of text-to-image models across all components of the pipeline. We propose Fair Diffusion for shifting a bias, based on human instructions, in any direction, yielding arbitrary new proportions for, e.g., identity groups.
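
Fair Diffusion builds on semantic guidance: opposing identity directions are applied during generation to shift the output distribution. A sketch of the idea using the diffusers semantic guidance pipeline (editing prompts and parameter values are illustrative, not the exact configuration from the paper):

```python
import torch
from diffusers import SemanticStableDiffusionPipeline

pipe = SemanticStableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The base model generates firefighters that are predominantly male; two
# opposing edit directions shift the proportion of generated identities.
out = pipe(
    prompt="a photo of the face of a firefighter",
    editing_prompt=["male person", "female person"],
    reverse_editing_direction=[True, False],  # remove "male", add "female"
    edit_guidance_scale=[6.0, 6.0],
    edit_warmup_steps=[10, 10],
    edit_threshold=[0.95, 0.95],
)
out.images[0].save("firefighter.png")
```

Flipping the reverse_editing_direction flags shifts the proportion the other way, so the target distribution stays under human control.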

Large-scale Evaluation

Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness?

We demonstrate inappropriate degeneration on a large-scale for various generative text-to-image models, thus motivating the need for monitoring and moderating them at deployment. To this end, we evaluate mitigation strategies at inference to suppress the generation of inappropriate content.
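
The I2P test bed introduced above is distributed on the Hugging Face Hub, which makes such an evaluation loop straightforward to set up. A sketch assuming the AIML-TUDA/i2p dataset; the downstream image classifiers used for scoring (e.g., Q16 and NudeNet) are omitted:

```python
from datasets import load_dataset
from diffusers import StableDiffusionPipelineSafe
from diffusers.pipelines.stable_diffusion_safe import SafetyConfig

# Real-world prompts covering inappropriate concepts.
i2p = load_dataset("AIML-TUDA/i2p", split="train")

pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe"
).to("cuda")

for record in i2p.select(range(8)):  # small sample for illustration
    prompt = record["prompt"]
    # Unmitigated baseline: sld_guidance_scale < 1 disables safety guidance.
    baseline = pipe(prompt=prompt, sld_guidance_scale=0).images[0]
    # Mitigated generation with a strong safety configuration.
    mitigated = pipe(prompt=prompt, **SafetyConfig.STRONG).images[0]
    # A classifier would score both images for inappropriate content here.
```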

People

Profile picture of Manuel Brack

Manuel is a PhD candidate at the German Research Center for AI (DFKI) and TU Darmstadt. In his research, he focuses on human-centric AI in the context of large-scale generative models.

Manuel Brack

DFKI, TU Darmstadt
Profile picture of Björn Deiseroth


Björn Deiseroth

Aleph Alpha, TU Darmstadt
Profile picture of Felix Friedrich

Felix is a PhD candidate at hessian.AI and TU Darmstadt. In his research, he focuses on fairness and explainability in AI models, integrating the human in the loop.

Felix Friedrich

TU Darmstadt, hessian.AI

Profile picture of Dominik Hintersdorf

Dominik is a PhD candidate at TU Darmstadt. In his research, he investigates security and privacy issues of deep learning systems in the context of multi-modal models.

Dominik Hintersdorf

TU Darmstadt
Profile picture of Patrick Schramowski

Patrick is a senior researcher at the German Research Center for AI (DFKI) and hessian.AI. In his research, he focuses on human-centric AI and AI alignment in the context of large-scale generative models.

Patrick Schramowski

DFKI, TU Darmstadt, hessian.AI
Profile picture of Lukas Struppek

Lukas is a PhD candidate at TU Darmstadt. In his research, he investigates security and privacy issues of deep learning systems in the context of generative models.

Lukas Struppek

TU Darmstadt

Relevant Publications