Imagebind demo update (lancedb#150)
raghavdixit99 authored Mar 2, 2024
1 parent 5a77fd1 commit 61db072
Showing 6 changed files with 42 additions and 185 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -7,8 +7,10 @@ package-lock.json
**.pkl
pandas_docs
**.lance
.venv
venv
__pycache__
*__pycache__*
*.DS_Store*

# multi-modal search example
examples/multimodal_video_search/videos
5 changes: 2 additions & 3 deletions README.md
@@ -57,9 +57,8 @@ If you're looking for in-depth tutorial-like examples, checkout the [tutorials](
| [Facial Recognition](./examples/facial_recognition) | <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/facial_recognition/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> |
| [Accelerate Vector Search Applications Using OpenVINO](/tutorials/Accelerate-Vector-Search-Applications-Using-OpenVINO/) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/Accelerate-Vector-Search-Applications-Using-OpenVINO/clip_text_image_search.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#)| [![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://blog.lancedb.com/accelerate-vector-search-applications-using-openvino-51366eabf866)|
| [Search Within Images](/examples/Contextual-Compression-with-RAG/) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/search-within-images-with-sam-and-clip/main.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://blog.lancedb.com/search-within-an-image-331b54e4285e)|
-| [Contextual-Compression-with-RAG](/examples/Contextual-Compression-with-RAG/) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/Contextual-Compression-with-RAG/main.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/etoai/enhance-rag-integrate-contextual-compression-and-filtering-for-precision-a29d4a810301)
-
-
+| [Contextual-Compression-with-RAG](/examples/Contextual-Compression-with-RAG/) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/Contextual-Compression-with-RAG/main.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) |[![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://medium.com/etoai/enhance-rag-integrate-contextual-compression-and-filtering-for-precision-a29d4a810301) |
+| [Imagebind demo app](/examples/imagebind_demo/) | <a href="https://huggingface.co/spaces/raghavd99/imagebind2"><img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo-with-title.svg" alt="hf spaces" style="width: 5.5rem; vertical-align: middle; background-color: white;"></a>|



4 changes: 3 additions & 1 deletion examples/imagebind_demo/README.md
@@ -3,7 +3,9 @@
A gradio app showcasing multi-modal capabilities of Imagebind supported via lanceDB API

## Usage
-you can run it locally by cloning the project as mentioned below, or access via Colab - <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/imagebind_demo/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
+you can run it locally by cloning the project as mentioned below, or access via Spaces: <a href="https://huggingface.co/spaces/raghavd99/imagebind2">
+<img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo-with-title.svg" alt="hf spaces" style="width: 5.75rem; vertical-align: middle; background-color: white;">
+</a>

```bash
git clone https://github.com/lancedb/vectordb-recipes.git
132 changes: 28 additions & 104 deletions examples/imagebind_demo/app.py
@@ -3,7 +3,7 @@
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector
import gradio as gr
-from downloader import dowload_and_save_audio, dowload_and_save_image
+from downloader import dowload_and_save_audio, dowload_and_save_image, base_path

model = get_registry().get("imagebind").create()

@@ -15,7 +15,7 @@ class TextModel(LanceModel):
    vector: Vector(model.ndims()) = model.VectorField()


-text_list = ["A bird", "A dragon", "A car"]
+text_list = ["A bird", "A dragon", "A car", "A guitar", "A witch", "Thunder"]
image_paths = dowload_and_save_image()
audio_paths = dowload_and_save_audio()
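
The three lists in this hunk appear to be read side by side: entry `i` of `text_list`, `image_paths`, and `audio_paths` describes the same concept, which is why this commit extends all three together. The part of `app.py` that builds the table is collapsed in this diff, so the following is only a sketch under that assumption. `build_rows` is a hypothetical helper, and `audio_path` is an assumed field name (`text` and `image_uri` do appear in the visible code).

```python
from typing import Dict, List


def build_rows(
    texts: List[str], images: List[str], audios: List[str]
) -> List[Dict[str, str]]:
    """Pair the i-th text, image, and audio entry into one record."""
    # Fail loudly if one list was extended and the others were not.
    if not (len(texts) == len(images) == len(audios)):
        raise ValueError(
            f"length mismatch: {len(texts)} texts, "
            f"{len(images)} images, {len(audios)} audio files"
        )
    return [
        # "audio_path" is an assumed field name, not taken from the commit.
        {"text": t, "image_uri": i, "audio_path": a}
        for t, i, a in zip(texts, images, audios)
    ]
```

A length check like this catches the case where a new prompt is added to `text_list` without a matching URL in `downloader.py`.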

@@ -60,107 +60,31 @@ def process_audio(inp_audio) -> str:
    return actual.image_uri, actual.text


-css = """
-output-audio, output-text {
-display: None
-}
-img {
-# width: 500px;
-# height: 450px;
-margin-left: auto;
-margin-right: auto;
-object-fit: cover;
-"""
-with gr.Blocks(css=css) as app:
-    # Using Markdown for custom CSS (optional)
-    with gr.Tab("Image to Text and Audio"):
-        with gr.Row():
-            with gr.Column():
-                inp1 = gr.Image(
-                    value=image_paths[0],
-                    type="filepath",
-                    elem_id="img",
-                    interactive=False,
-                )
-                output_audio1 = gr.Audio(label="Output Audio", elem_id="output-audio")
-                output_text1 = gr.Textbox(label="Output Text", elem_id="output-text")
-                btn_img1 = gr.Button("Retrieve")
-
-            # output_audio1 = gr.Audio(label="Output Audio 1", elem_id="output-audio1")
-            with gr.Column():
-                inp2 = gr.Image(
-                    value=image_paths[1],
-                    type="filepath",
-                    elem_id="img",
-                    interactive=False,
-                )
-                output_audio2 = gr.Audio(label="Output Audio", elem_id="output-audio")
-                output_text2 = gr.Textbox(label="Output Text", elem_id="output-text")
-                btn_img2 = gr.Button("Retrieve")
-
-            with gr.Column():
-                inp3 = gr.Image(
-                    value=image_paths[2],
-                    type="filepath",
-                    elem_id="img",
-                    interactive=False,
-                )
-                output_audio3 = gr.Audio(label="Output Audio", elem_id="output-audio")
-                output_text3 = gr.Textbox(label="Output Text", elem_id="output-text")
-                btn_img3 = gr.Button("Retrieve")
-
-    with gr.Tab("Text to Image and Audio"):
-        with gr.Row():
-            with gr.Column():
-                input_txt1 = gr.Textbox(label="Enter a prompt:", elem_id="output-text")
-                output_audio4 = gr.Audio(label="Output Audio", elem_id="output-audio")
-                output_img1 = gr.Image(type="filepath", elem_id="img")
-
-    with gr.Tab("Audio to Image and Text"):
-        with gr.Row():
-            with gr.Column():
-                inp_audio1 = gr.Audio(
-                    value=audio_paths[0], type="filepath", interactive=False
-                )
-                output_img7 = gr.Image(type="filepath", elem_id="img")
-                output_text7 = gr.Textbox(label="Output Text", elem_id="output-text")
-                btn_audio1 = gr.Button("Retrieve")
-
-            with gr.Column():
-                inp_audio2 = gr.Audio(
-                    value=audio_paths[1], type="filepath", interactive=False
-                )
-                output_img8 = gr.Image(type="filepath", elem_id="img")
-                output_text8 = gr.Textbox(label="Output Text", elem_id="output-text")
-                btn_audio2 = gr.Button("Retrieve")
-
-            with gr.Column():
-                inp_audio3 = gr.Audio(
-                    value=audio_paths[2], type="filepath", interactive=False
-                )
-                output_img9 = gr.Image(type="filepath", elem_id="img")
-                output_text9 = gr.Textbox(label="Output Text", elem_id="output-text")
-                btn_audio3 = gr.Button("Retrieve")
-
-    # Click actions for buttons/Textboxes
-    btn_img1.click(process_image, inputs=[inp1], outputs=[output_text1, output_audio1])
-    btn_img2.click(process_image, inputs=[inp2], outputs=[output_text2, output_audio2])
-    btn_img3.click(process_image, inputs=[inp3], outputs=[output_text3, output_audio3])
-
-    input_txt1.submit(
-        process_text, inputs=[input_txt1], outputs=[output_img1, output_audio4]
-    )
-
-    btn_audio1.click(
-        process_audio, inputs=[inp_audio1], outputs=[output_img7, output_text7]
-    )
-    btn_audio2.click(
-        process_audio, inputs=[inp_audio2], outputs=[output_img8, output_text8]
-    )
-    btn_audio3.click(
-        process_audio, inputs=[inp_audio3], outputs=[output_img9, output_text9]
-    )
+im_to_at = gr.Interface(
+    process_image,
+    gr.Image(type="filepath", value=image_paths[0]),
+    [gr.Text(label="Output Text"), gr.Audio(label="Output Audio")],
+    examples=image_paths,
+    allow_flagging="never",
+)
+txt_to_ia = gr.Interface(
+    process_text,
+    gr.Textbox(label="Enter a prompt:"),
+    [gr.Image(label="Output Image"), gr.Audio(label="Output Audio")],
+    allow_flagging="never",
+    examples=text_list,
+)
+a_to_it = gr.Interface(
+    process_audio,
+    gr.Audio(type="filepath", value=audio_paths[0]),
+    [gr.Image(label="Output Image"), gr.Text(label="Output Text")],
+    examples=audio_paths,
+    allow_flagging="never",
+)
+demo = gr.TabbedInterface(
+    [im_to_at, txt_to_ia, a_to_it],
+    ["Image to Text/Audio", "Text to Image/Audio", "Audio to Image/Text"],
+)

if __name__ == "__main__":
-    app.launch(share=True, allowed_paths=["./test_inputs/"])
+    demo.launch(share=True, allowed_paths=[f"{base_path}/test_inputs/"])
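
The `launch` change swaps a path relative to the working directory (`./test_inputs/`) for one anchored at `base_path`, which `downloader.py` derives from its own `__file__`. A minimal standard-library sketch of why that matters, assuming the module path passed in is absolute; `inputs_dir_for` is a hypothetical helper, not part of the commit:

```python
import os


def inputs_dir_for(module_file: str, subdir: str = "test_inputs") -> str:
    """Return `subdir` next to `module_file` as an absolute path.

    Given an absolute `module_file`, the result does not change when the
    process is started from a different working directory, unlike "./test_inputs/".
    """
    base_path = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base_path, subdir)
```

With the old relative form, starting the app from the repository root rather than from `examples/imagebind_demo/` would point at a different directory than the one the files were saved to.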
6 changes: 6 additions & 0 deletions examples/imagebind_demo/downloader.py
@@ -7,11 +7,17 @@
    "https://github.com/raghavdixit99/assets/raw/main/bird_audio.wav",
    "https://github.com/raghavdixit99/assets/raw/main/dragon-growl-37570.wav",
    "https://github.com/raghavdixit99/assets/raw/main/car_audio.wav",
+    "https://github.com/raghavdixit99/assets/raw/main/acoustic-guitar.wav",
+    "https://github.com/raghavdixit99/assets/raw/main/witch-103635.wav",
+    "https://github.com/raghavdixit99/assets/raw/main/thunder-25689.wav",
]
image_urls = [
    "https://github.com/raghavdixit99/assets/assets/34462078/abf47cc4-d979-4aaa-83be-53a2115bf318",
    "https://github.com/raghavdixit99/assets/assets/34462078/93be928e-522b-4e37-889d-d4efd54b2112",
    "https://github.com/raghavdixit99/assets/assets/34462078/025deaff-632a-4829-a86c-3de6e326402f",
+    "https://github.com/raghavdixit99/assets/assets/34462078/a20bff32-155c-4bad-acf1-97856c493099",
+    "https://github.com/raghavdixit99/assets/assets/34462078/4f7dadd8-b38c-4c14-ac8a-5a2e74414f6a",
+    "https://github.com/raghavdixit99/assets/assets/34462078/ac11eeab-7b2b-4db3-981b-d5fed08d9bc2",
]

base_path = os.path.dirname(os.path.abspath(__file__))
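
The audio URLs end in a real file name (for example `bird_audio.wav`), while the image URLs end in a bare asset ID with no extension, so whatever saves them locally has to pick a name. The bodies of `dowload_and_save_audio` and `dowload_and_save_image` are not shown in this diff; the helper below is a hypothetical sketch of one way to map a URL to a path under `base_path`. The `.jpg` fallback extension is an assumption, and the `test_inputs` directory name is taken from the `allowed_paths` argument in `app.py`.

```python
import os
from urllib.parse import urlparse


def local_path_for(url: str, base_dir: str, fallback_ext: str = ".jpg") -> str:
    """Map a download URL to a file path under base_dir/test_inputs."""
    # Last path segment of the URL, e.g. "bird_audio.wav" or a bare asset ID.
    name = os.path.basename(urlparse(url).path)
    # Asset URLs without an extension get the fallback one (an assumption).
    if not os.path.splitext(name)[1]:
        name += fallback_ext
    return os.path.join(base_dir, "test_inputs", name)
```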
76 changes: 0 additions & 76 deletions examples/imagebind_demo/main.ipynb

This file was deleted.
