[Tasks] Add Llama-3.2-11B-Vision-Instruct as the first recommended model for `image-text-to-text` (#956)

This PR proposes adding `meta-llama/Llama-3.2-11B-Vision-Instruct` as the first
recommended model for the
[`image-text-to-text`](https://huggingface.co/tasks/image-text-to-text) task.
This would be nice to have because we also want to show in the [Inference
API documentation](https://huggingface.co/docs/api-inference/index) how
to perform inference with this model, since it is both supported by TGI
and a popular conversational VLM.

Feel free to suggest a better description or a different ranking for this
model.

cc @merveenoyan @osanseviero
hanouticelina authored Oct 8, 2024
1 parent 44095da commit 161d341
Showing 1 changed file with 2 additions and 2 deletions.
`packages/tasks/src/tasks/image-text-to-text/data.ts` (2 additions, 2 deletions):

```diff
@@ -43,8 +43,8 @@ const taskData: TaskDataCustom = {
 	metrics: [],
 	models: [
 		{
-			description: "Cutting-edge vision language model that can take multiple image inputs.",
-			id: "facebook/chameleon-7b",
+			description: "Powerful vision language model with great visual understanding and reasoning capabilities.",
+			id: "meta-llama/Llama-3.2-11B-Vision-Instruct",
 		},
 		{
 			description: "Cutting-edge conversational vision language model that can take multiple image inputs.",
```
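Since the description mentions showing Inference API usage for this model, here is a minimal sketch of what a chat-style request with an image input could look like. This assumes the TGI "messages" (OpenAI-compatible) payload shape and the `api-inference.huggingface.co` chat-completions endpoint; the helper name `buildVisionChatRequest` and the example URLs are illustrative, not part of this PR.

```typescript
// Sketch: building a chat-completions payload for a vision-language model.
// Assumes the OpenAI-compatible "messages" format served by TGI; the
// endpoint and exact payload shape may differ, so verify against the
// Inference API documentation before relying on this.

type ChatContent =
	| { type: "text"; text: string }
	| { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
	model: string;
	messages: { role: "system" | "user" | "assistant"; content: ChatContent[] }[];
	max_tokens: number;
}

function buildVisionChatRequest(imageUrl: string, question: string): ChatRequest {
	return {
		model: "meta-llama/Llama-3.2-11B-Vision-Instruct",
		messages: [
			{
				role: "user",
				// One image part followed by one text part, as in the
				// OpenAI-compatible multimodal message format.
				content: [
					{ type: "image_url", image_url: { url: imageUrl } },
					{ type: "text", text: question },
				],
			},
		],
		max_tokens: 256,
	};
}

// Example call (hypothetical; requires a valid HF token and network access):
// const res = await fetch(
// 	"https://api-inference.huggingface.co/models/meta-llama/Llama-3.2-11B-Vision-Instruct/v1/chat/completions",
// 	{
// 		method: "POST",
// 		headers: {
// 			Authorization: `Bearer ${process.env.HF_TOKEN}`,
// 			"Content-Type": "application/json",
// 		},
// 		body: JSON.stringify(buildVisionChatRequest("https://example.com/cat.png", "What is in this image?")),
// 	}
// );
```

The payload builder is kept separate from the network call so the request shape can be inspected or reused with other clients (for example `@huggingface/inference` from this same repository).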
