
[BUG] Version 2.6 is bound to flash_attn by default with no way to disable it, and no matching flash_attn version or installation example is provided #429

Closed
2 tasks done
kaixindelele opened this issue Aug 8, 2024 · 11 comments

Comments

@kaixindelele

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in the FAQ?

  • I have searched the FAQ

Current Behavior

ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn`

The 2.6 demo raises this error.

Expected Behavior

Please either provide a way to drop the hard dependency on flash_attn, or provide an installation guide that pins a known-good flash_attn version (see the sketch below).
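A minimal sketch of the decoupling requested above, assuming the model's remote code honors transformers' standard `attn_implementation` argument; the model id and dtype below are illustrative:

import torch
from transformers import AutoModel

# Ask transformers for SDPA attention so flash_attn is never required at runtime.
model = AutoModel.from_pretrained(
    "openbmb/MiniCPM-V-2_6",       # illustrative model id
    trust_remote_code=True,
    attn_implementation="sdpa",    # instead of "flash_attention_2"
    torch_dtype=torch.float16,     # or torch.bfloat16
)

This may not be enough on its own if the remote modeling file still declares flash_attn as a hard import (which is exactly what the error above complains about); in that case, combine it with the get_imports patch shown further down this thread.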

Steps To Reproduce

The 2.6 demo script, on a Linux system.

Environment

Python: 3.10
Transformers: 4.40.0
PyTorch: 2.4.0+cu121
CUDA: 12.2

Anything else?

No response

@adlifuad-asc

I don't understand Chinese, but I think we have a similar problem with flash_attn.

I use Python 3.11 and install the libraries in this order (a version-check sketch follows the list):

numpy==1.24.3
Pillow==10.1.0
torch==2.1.2
torchvision==0.16.2
transformers==4.40.0
sentencepiece==0.1.99
accelerate==0.30.1
bitsandbytes==0.43.1
flash_attn
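
Install order matters because a flash_attn wheel has to match the already-installed torch, CUDA, and Python versions. A small sketch, assuming it runs inside the same environment, that prints the triple a wheel must match:

import sys
import torch

print("python:", sys.version.split()[0])
print("torch:", torch.__version__)                    # e.g. 2.1.2
print("cuda (torch build):", torch.version.cuda)      # e.g. 12.1
print("cxx11 abi:", torch._C._GLIBCXX_USE_CXX11_ABI)
try:
    import flash_attn
    print("flash_attn:", flash_attn.__version__)
except ImportError as exc:
    print("flash_attn not importable:", exc)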

@HongLouyemeng

HongLouyemeng commented Aug 9, 2024

Hey, did you find a solution? I'm stuck here too. On Windows with cu117 and torch 2.1.0 I couldn't find a matching flash_attn package.

@kaixindelele
Author

Haha, I gave up on Windows and switched to a desktop machine, where it worked on the first try:
RTX 3090 GPU
Ubuntu 20.04
Driver Version: 545.23.08
CUDA Version: 12.3
transformers 4.40.0
torch 2.4.0+cu124
torchaudio 2.4.0+cu124
flash-attn 2.6.3
flash-attn installed in one go.

@HongLouyemeng

Haha, I gave up on Windows and switched to a desktop machine, where it worked on the first try: RTX 3090 GPU, Ubuntu 20.04, Driver Version: 545.23.08, CUDA Version: 12.3, transformers 4.40.0, torch 2.4.0+cu124, torchaudio 2.4.0+cu124, flash-attn 2.6.3, flash-attn installed in one go.

Wow, that's rough. I don't think I have a Linux machine.

@HongLouyemeng

On Windows, just use the prebuilt .whl file that matches your CUDA, cuDNN, and torch versions.
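
For reference, a sketch that assembles the tags a matching Windows wheel would typically carry in its filename; the naming pattern is an assumption based on how flash-attn release wheels are usually named, so verify it against the actual release page you download from:

import sys
import torch

# Hypothetical helper: print the wheel filename pattern to look for.
py_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
cu_tag = "cu" + (torch.version.cuda or "cpu").replace(".", "")
torch_tag = "torch" + ".".join(torch.__version__.split("+")[0].split(".")[:2])
print(f"flash_attn-<version>+{cu_tag}{torch_tag}cxx11abiFALSE-{py_tag}-{py_tag}-win_amd64.whl")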

@Anionex

Anionex commented Aug 9, 2024

Try installing flash-attn==1.0.4; that works on my end.

@BothSavage

BothSavage commented Aug 9, 2024

Solved it on Mac; wrote up a blog post: https://bothsavage.github.io/article/240810-minicpm2.6

Submitted PR: #461

Modify the web_demo_2.6.py file:

import os
from typing import Union
from unittest.mock import patch

import torch
from transformers import AutoModel
from transformers.dynamic_module_utils import get_imports

# fix the imports: drop flash_attn from the dynamic-module import check
# when no CUDA device is available
def fixed_get_imports(filename: Union[str, os.PathLike]) -> list[str]:
    imports = get_imports(filename)
    if not torch.cuda.is_available() and "flash_attn" in imports:
        imports.remove("flash_attn")
    return imports

..........

with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True,
                                      torch_dtype=torch.float16)  # original had torch.dtype, which is a type, not a dtype value
    model = model.to(device=device)
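
For the Mac case described here, a small sketch (assuming a torch build with MPS support) of how `device` could be chosen so the patched loader runs without CUDA:

import torch

# Prefer Apple's MPS backend when available, otherwise fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"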

@robertio

robertio commented Aug 10, 2024

Hi
Solution for me:

  1. Removed the pinned versions from some packages in requirements.txt, uncommented flash_attn, and added bitsandbytes, so the list reads:
    spacy gradio torch torchvision bitsandbytes flash_attn

  2. pip install --upgrade pip setuptools wheel # this solves the torch <-> flash_attn wheel issue
    pip install -r requirements.txt
    Install went fine.

  3. pip install torch torchvision --upgrade # needed for me

This shows: Successfully installed nvidia-cudnn-cu12-9.1.0.70 nvidia-nccl-cu12-2.20.5 torch-2.4.0 torchvision-0.19.0 triton-3.0.0

python web_demo_2.6.py --device cuda

Now the web demo starts, but I get CUDA out of memory :) (I have a 12 GB VRAM RTX 3060), but that is another problem.
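
Since bitsandbytes is already in the environment, a hedged sketch of one way to squeeze the model into 12 GB; the quantization settings and model id are illustrative, and whether the remote code tolerates 4-bit loading should be verified (an official int4 checkpoint, if available, may be the simpler route):

import torch
from transformers import AutoModel, BitsAndBytesConfig

# Illustrative 4-bit load to reduce VRAM use on a 12 GB card.
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.float16)
model = AutoModel.from_pretrained("openbmb/MiniCPM-V-2_6",  # illustrative model id
                                  trust_remote_code=True,
                                  quantization_config=bnb_config)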

@xudawu201

xudawu201 commented Aug 11, 2024

You can also just delete the flash_attn package and it still runs; I saw that alternative ways to run it are documented.

@vagetablechicken

I caught the import error; it may look like /home/xxx/miniconda3/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE, ref Dao-AILab/flash-attention#919

So I use flash-attn==2.5.8 with torch==2.3.0.
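
A quick sketch to check whether the compiled extension links cleanly against the installed torch; the module name is taken from the .so path in the error above:

import importlib

import torch  # load torch first so its shared libraries are available to the extension

importlib.import_module("flash_attn_2_cuda")  # raises ImportError with an undefined-symbol message on a mismatch
print("flash_attn CUDA extension loads cleanly")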

@teneous

teneous commented Aug 13, 2024

(quoting @BothSavage's get_imports patch from earlier in the thread)

It's very useful, thx bro.

@Cuiunbo Cuiunbo closed this as completed Aug 15, 2024