Divide & Bind Your Attention for Improved Generative Semantic Nursing
Yumeng Li1,2 Margret Keuper2,3 Dan Zhang1,4 Anna Khoreva1,4
1Bosch Center for AI 2Siegen University 3MPI for Informatics 4Tübingen University
BMVC 2023 Oral
Our Divide & Bind can significantly improve a pretrained text-to-image model, faithfully generating multiple objects from a detailed textual description. Compared to Attend & Excite, the prior state-of-the-art semantic nursing technique for text-to-image synthesis, our approach exhibits superior alignment with the input prompt and maintains a higher level of realism.
Method Overview
We perform latent optimization on the fly based on the attention maps, without fine-tuning the pretrained text-to-image model. We propose two novel loss terms: (1) a total-variation based attendance loss and (2) a Jensen–Shannon divergence based binding loss.
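As a rough sketch of what optimizing the latent on the fly could look like, the two loss terms can be combined into one objective and used for a gradient step on the latent at a denoising step. The symbols here are assumptions introduced for illustration and are not taken from this page: z_t is the latent at step t, \alpha_t a step size, and \lambda a weight on the binding term.

```latex
\mathcal{L} = \mathcal{L}_{\text{attend}} + \lambda\,\mathcal{L}_{\text{bind}},
\qquad
z_t' = z_t - \alpha_t \,\nabla_{z_t}\mathcal{L}
```

Under this reading, only the latent changes; the weights of the pretrained model stay fixed, which is consistent with the statement that no fine-tuning is needed.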
More Results
(Best viewed on a laptop)
Cross Attention Visualization
Divide for Attendance
With more complex prompts, the competition between tokens becomes more severe. We propose to maximize the total variation (TV) of the object tokens' attention maps to stimulate multiple excitations, which reduces the risk of conflicts with the other tokens.
"A dog and a turtle on the street, snowy scene"
Stable Diffusion
Attend & Excite
Divide & Bind (Ours)
"A pineapple and two oranges"
Stable Diffusion
Attend & Excite
Divide & Bind (Ours)
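The attendance objective in this section can be illustrated with a minimal, dependency-free sketch. It assumes an anisotropic total variation (sum of absolute differences between horizontal and vertical neighbors) and assumes the loss is the negative TV of the weakest object token; both choices, and the function names `total_variation` and `attendance_loss`, are hypothetical and not confirmed by this page. Attention maps are plain nested lists here; a real implementation would operate on differentiable tensors.

```python
def total_variation(attn):
    """Sum of absolute differences between neighboring entries of a 2D map."""
    rows, cols = len(attn), len(attn[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:  # vertical neighbor
                tv += abs(attn[i + 1][j] - attn[i][j])
            if j + 1 < cols:  # horizontal neighbor
                tv += abs(attn[i][j + 1] - attn[i][j])
    return tv


def attendance_loss(object_maps):
    """Negative TV of the object token with the lowest TV (assumed aggregation).

    Minimizing this value raises the TV of the least-attended object token.
    """
    return -min(total_variation(m) for m in object_maps)


# A flat map has no local variation; a peaked map does.
flat = [[0.25, 0.25], [0.25, 0.25]]
peaked = [[1.0, 0.0], [0.0, 0.0]]

tv_flat = total_variation(flat)      # 0.0
tv_peaked = total_variation(peaked)  # 2.0
loss = attendance_loss([flat, peaked])
```

In this toy case the flat map dominates the loss, so a gradient step would push that token's attention toward stronger local excitations.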
Attribute Binding Regularization
We explicitly minimize the Jensen–Shannon (JS) divergence between the attention maps of the object token and its attribute token. With our binding loss applied, the attribute attention map becomes more localized.
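A minimal sketch of such a binding term is shown below. It assumes each attention map is first normalized to sum to one so it can be treated as a probability distribution; that normalization and the helper names `normalize`, `kl`, and `js_divergence` are assumptions made for illustration, not details given on this page.

```python
import math


def normalize(attn):
    """Flatten a 2D attention map and scale it to sum to 1."""
    flat = [v for row in attn for v in row]
    total = sum(flat)
    return [v / total for v in flat]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(attn_a, attn_b):
    """Jensen-Shannon divergence between two attention maps (natural log)."""
    p, q = normalize(attn_a), normalize(attn_b)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Object and attribute maps that attend to the same region vs. different regions.
object_map = [[0.9, 0.1], [0.0, 0.0]]
aligned_attr = [[0.9, 0.1], [0.0, 0.0]]
misaligned_attr = [[0.0, 0.0], [0.1, 0.9]]

low = js_divergence(object_map, aligned_attr)      # identical maps
high = js_divergence(object_map, misaligned_attr)  # disjoint maps
```

The JS divergence is symmetric and bounded (by log 2 with the natural logarithm), which makes it a convenient penalty: it is zero when the attribute map coincides with the object map and largest when the two maps do not overlap.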
Combined with ControlNet
We combine Divide & Bind with a conditional text-to-image model, ControlNet, which introduces an additional condition such as a semantic label map. With our optimization applied, the generated images follow the conditional inputs more closely.
BibTeX
@inproceedings{Li2023divide,
title={Divide \& Bind Your Attention for Improved Generative Semantic Nursing},
author={Li, Yumeng and Keuper, Margret and Zhang, Dan and Khoreva, Anna},
booktitle={34th British Machine Vision Conference 2023, {BMVC} 2023},
year={2023}
}