From 914bc409b0e1258db3980ff881a9d4beb6403fbd Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Thu, 21 Dec 2023 00:44:36 -0800
Subject: [PATCH] Bump transformers from 4.30.0 to 4.36.0 in /tools/ci_build
(#18895)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Bumps [transformers](https://github.com/huggingface/transformers) from
4.30.0 to 4.36.0.
Release notes
Sourced from transformers's releases.

v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2, AMD ROCm, F.sdpa wide-spread support

New model additions

Mixtral

Mixtral is the new open-source model from Mistral AI announced in the Mixtral of Experts blog post. The model has been shown to have capabilities comparable to ChatGPT according to the benchmark results shared in the release blog post.

The architecture is a sparse Mixture of Experts with a Top-2 routing strategy, similar to the NllbMoe architecture in transformers. You can use it through the AutoModelForCausalLM interface:

>>> import torch
>>> from transformers import AutoModelForCausalLM, AutoTokenizer

>>> device = "cuda"
>>> model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B", torch_dtype=torch.float16, device_map="auto")
>>> tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B")

>>> prompt = "My favourite condiment is"

>>> model_inputs = tokenizer([prompt], return_tensors="pt").to(device)
>>> model.to(device)

>>> generated_ids = model.generate(**model_inputs, max_new_tokens=100, do_sample=True)
>>> tokenizer.batch_decode(generated_ids)[0]

The model is compatible with existing optimisation tools such as Flash Attention 2, bitsandbytes and the PEFT library. The checkpoints are released under the mistralai organisation on the Hugging Face Hub.
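As a complement to the snippet above, here is a minimal sketch of loading the same checkpoint with those optimisation tools; the quantization and attention settings are illustrative assumptions, not taken from the release notes, and they assume bitsandbytes, accelerate and flash-attn are installed alongside transformers >= 4.36.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B"  # checkpoint id reused from the snippet above

# 4-bit quantization via bitsandbytes reduces the memory footprint of the 8x7B checkpoint.
quantization_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    attn_implementation="flash_attention_2",  # or "sdpa" if flash-attn is not available
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)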
Llava

Llava is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture; in other words, it is a multi-modal version of LLMs fine-tuned for chat / instructions.
The Llava model was proposed in Improved Baselines with Visual Instruction Tuning by Haotian Liu, Chunyuan Li, Yuheng Li and Yong Jae Lee.

[Llava] Add Llava to transformers by @younesbelkada in #27662
[LLaVa] Some improvements by @NielsRogge in #27895

The integration also includes BakLlava, which is a Llava model trained with a Mistral backbone.
The model is compatible with the "image-to-text" pipeline:
from transformers import pipeline
from PIL import Image
import requests
model_id = "llava-hf/llava-1.5-7b-hf"
... (truncated)
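The pipeline example above is cut off by the truncation; the following is a minimal sketch of how such an "image-to-text" pipeline call typically looks, with a placeholder image URL and prompt that are not taken from the release notes.

from transformers import pipeline
from PIL import Image
import requests

model_id = "llava-hf/llava-1.5-7b-hf"
pipe = pipeline("image-to-text", model=model_id)

# Placeholder inputs -- illustrative only, not from the original release notes.
url = "https://example.com/demo.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "USER: <image>\nDescribe this picture.\nASSISTANT:"

outputs = pipe(image, prompt=prompt, generate_kwargs={"max_new_tokens": 100})
print(outputs[0]["generated_text"])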
Commits

1466677 Release: v4.36.0
accccdd [Add Mixtral] Adds support for the Mixtral MoE (#27942)
0676d99 [from_pretrained] Make from_pretrained fast again (#27709)
9f18cc6 Fix SDPA dispatch & make SDPA CI compatible with torch<2.1.1 (#27940)
7ea21f1 [LLaVa] Some improvements (#27895)
5e620a9 Fix SeamlessM4Tv2ModelIntegrationTest (#27911)
e96c1de Skip UnivNetModelTest::test_multi_gpu_data_parallel_forward (#27912)
8d8970e [BEiT] Fix test (#27934)
235be08 [DETA] fix backbone freeze/unfreeze function (#27843)
df5c5c6 Fix typo (#27918)