diff --git a/index.html b/index.html
index c16f51e..d1dacde 100644
--- a/index.html
+++ b/index.html
@@ -246,8 +246,9 @@

Evaluation

+

In the following section we provide a concise overview of the quantitative and qualitative evaluation of MultiFusion.

Image Fidelity and Text-to-Image Alignment

-

We meassure image fidelity and image-text-alignment using the standard metrics FID-30K and Clip Scores. We find that MultiFusion prompted with text only performs on par with Stable Diffusion despite extension of the Encoder to support multiple languages and modalities.

+

First, we measure image fidelity and image-text alignment using the standard metrics FID-30K and CLIP score. We find that MultiFusion prompted with text only performs on par with Stable Diffusion, despite the extension of the encoder to support multiple languages and modalities.


Compositional Robustness

@@ -255,7 +256,8 @@

Compositional Robustness

method
-

Image Composition is a known limitation of Diffusion Models. Through evaluation of our new benchmark MCC-250 we show that multimodal prompting leads to more compositional robustness as judged by humans.

+

Image Composition is a known limitation of Diffusion Models. Through evaluation of our new benchmark MCC-250 we show that multimodal prompting leads to more compositional robustness as judged by humans. Each prompt is a complex conjunction of two different objects with different colors, with multimodal prompts containing one visual reference for each object interleaved with the text input.

Multilinguality