[llama-mm] Onboard Llama3.2 mm vision encoder #6653

larryliu0820 · 2024-11-05T01:26:47Z

Stack from ghstack (oldest at bottom):

Summary: Add the Llama 3.2 multimodal (mm) vision encoder to examples/models.

TilePositionEmbedding has to be swapped for an export-friendly module so
that the vision encoder is exportable.
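
For readers unfamiliar with module swapping, here is a minimal sketch of the general pattern. It is not the code in this PR: `OriginalEmbedding`, `ExportFriendlyEmbedding`, and `swap_modules` are hypothetical names made up for illustration, and both classes are trivial stand-ins rather than the real tile position embedding.

```python
import torch
from torch import nn


class OriginalEmbedding(nn.Module):
    """Hypothetical stand-in for a module that does not export cleanly."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.scale


class ExportFriendlyEmbedding(nn.Module):
    """Hypothetical replacement with the same parameter names and shapes."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.scale


def swap_modules(root: nn.Module) -> nn.Module:
    """Recursively replace every OriginalEmbedding under root, in place."""
    # Snapshot the children so that reassigning attributes is safe.
    for name, child in list(root.named_children()):
        if isinstance(child, OriginalEmbedding):
            replacement = ExportFriendlyEmbedding(child.scale.shape[0])
            # Copy the existing weights so the model's outputs do not change.
            replacement.load_state_dict(child.state_dict())
            setattr(root, name, replacement)
        else:
            swap_modules(child)
    return root
```

Because the replacement loads the original `state_dict`, the swapped model should produce the same outputs as before; only the implementation of `forward` is meant to differ.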

Test Plan: Unit tests.

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2024-11-05T01:26:50Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6653

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

GLIBC not found in Nova workflows

❌ 7 New Failures

As of commit c533944 with merge base 3a1f8d2:

NEW FAILURES - The following jobs have failed:

Propose to merge ghstack orig PRs to main / Try to create a PR with ghstack /orig branch (gh)
github.GithubException.GithubException: 404 {"message": "Branch not found", "documentation_url": "https://docs.github.com/rest/branches/branches#get-a-branch", "status": "404"}
pull / test-binary-size-linux-gcc / linux-job (gh)
RuntimeError: Command docker exec -t a6219dc9053c9f0abd401b8bfd484a66a0a8fd3062b2d5aa616ba8a6c95923d5 /exec failed with exit code 1
pull / unittest / linux / linux-job (gh)
RuntimeError: Command docker exec -t 8512ac2daf656325bdc8ff8fd51057bb2b185c80cbde6b64c58d631edca46896 /exec failed with exit code 5
pull / unittest / macos / macos-job (gh)
RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 5
pull / unittest-arm / linux-job (gh)
RuntimeError: Command docker exec -t 7456a16a3f586a2e5c6126ef3f1043326a9f3632602ba9649373186d642b06b4 /exec failed with exit code 1
trunk / test-arm-backend-delegation / linux-job (gh)
RuntimeError: Command docker exec -t a7f687f0147dafeda34c2fe85fce86920e7a37ce6aef637278d7178b4eb5f160 /exec failed with exit code 1
trunk / test-arm-reference-delegation / linux-job (gh)
RuntimeError: Command docker exec -t 5d4b07b5049f2ee56e3c97a1ccf18af827393e6bbeba82003d63d955361e52db /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Summary: Add llama3.2 mm vision encoder to examples/models. We need to do a module swapping for TilePositionEmbedding to make sure vision encoder is exportable. Test Plan: Unit tests. Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 4c2a30e6d6b5932a972c34778fea8b3152372e58 Pull Request resolved: #6653

Summary: Add llama3.2 mm vision encoder to examples/models. We need to do a module swapping for TilePositionEmbedding to make sure vision encoder is exportable. Test Plan: Unit tests. Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: f996a614d7ec1b4c7fa4c1cb20bdcc02d813771c Pull Request resolved: #6653

tarun292 · 2024-11-14T01:46:14Z

examples/models/llama3_2_vision/vision_encoder/model.py

+from executorch.extension.llm.modules._position_embeddings import (
+    replace_tile_positional_embedding,
+)
+from torchtune.models.flamingo._component_builders import flamingo_vision_encoder


The flamingo import needs to change to:

from torchtune.models.llama3_2_vision._component_builders import llama3_2_vision_encoder
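
If both torchtune layouts need to be supported for a while, one option is to try the new import path first and fall back to the old one. The sketch below is only an illustration under that assumption: `resolve_first` and `VISION_ENCODER_CANDIDATES` are hypothetical names, and the two module paths are simply the ones quoted in this comment.

```python
import importlib
from typing import Any, Iterable, Tuple

# The two import locations quoted in this review comment, newest first.
VISION_ENCODER_CANDIDATES = [
    ("torchtune.models.llama3_2_vision._component_builders", "llama3_2_vision_encoder"),
    ("torchtune.models.flamingo._component_builders", "flamingo_vision_encoder"),
]


def resolve_first(candidates: Iterable[Tuple[str, str]]) -> Any:
    """Return the first attribute that can be imported from the candidates."""
    tried = []
    for module_name, attr_name in candidates:
        tried.append(f"{module_name}.{attr_name}")
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # Module path does not exist in this install.
        attr = getattr(module, attr_name, None)
        if attr is not None:
            return attr
    raise ImportError("none of these could be imported: " + ", ".join(tried))
```

Calling `resolve_first(VISION_ENCODER_CANDIDATES)` would return whichever builder the installed torchtune provides, or raise `ImportError` if neither path is available.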

larryliu0820 · 2024-11-14T02:38:28Z

Closing in favor of #6831.

[llama-mm] Onboard Llama3.2 mm vision encoder

2e77105

larryliu0820 mentioned this pull request Nov 5, 2024

[llama-mm] Add export-friendly tile position embedding #6650

Merged

facebook-github-bot added the CLA Signed label Nov 5, 2024

tarun292 approved these changes Nov 5, 2024

tarun292 reviewed Nov 14, 2024

larryliu0820 closed this Nov 14, 2024

larryliu0820 had a problem deploying to cherry-pick-bot November 14, 2024 02:38 — with GitHub Actions Failure
