diff --git a/assets/images/background.png b/assets/images/background.png
new file mode 100644
index 0000000..8bc5880
Binary files /dev/null and b/assets/images/background.png differ
diff --git a/index.html b/index.html
index 2e04cba..7ead46f 100644
--- a/index.html
+++ b/index.html
@@ -90,6 +90,18 @@

Introducing MyVLM

+
+
+

Background

+

+ LLMs offer users an intuitive interface for interacting with textual information. Integrating vision into LLMs through VLMs enables these models to "see" and reason over visual content. However, these VLMs possess only generic knowledge and lack a personal touch. With MyVLM, we equip these models with the ability to comprehend user-specific concepts, tailoring the model specifically to you. MyVLM allows users to obtain personalized responses: outputs are no longer generic, but focus on communicating information about the target subject to the user.
+

+
+
+
+
+
+
@@ -97,22 +109,22 @@

Introducing MyVLM

"<you>, wearing sunglasses and a yellow strap, standing on a bustling street in a colorful city."
-
+
-

The Vision Language Models

-

We apply MyVLM to various VLM architectures for personalized captioning, visual question-answering, and referring expression comprehension.

+

The Vision Language Models

+

We apply MyVLM to various VLM architectures for personalized captioning, visual question-answering, and referring expression comprehension.

-

BLIP-2

+

BLIP-2

-

LLaVA 1.6

+

LLaVA 1.6

-

MiniGPT-v2

+

MiniGPT-v2