[BUG] AttributeError: 'NoneType' object has no attribute 'set_moe' #6572

Open · zhanwenchen opened this issue on Sep 25, 2024
Labels: bug, inference

zhanwenchen commented on Sep 25, 2024

Describe the bug
During inference, I got AttributeError: 'NoneType' object has no attribute 'set_moe'. Immediately before the failure, the log also showed the INFO line: Policy type <class 'deepspeed.module_inject.containers.clip.HFCLIPLayerPolicy'> not supported.

(multivu) zhanwen@zhanwen-mini:~/multivu$ bash scripts/eval.sh | tee log_eval.log
[2024-09-25 15:53:43,055] [INFO] [logging.py:96:log_dist] [Rank 0] Policy type <class 'deepspeed.module_inject.containers.clip.HFCLIPLayerPolicy'> not supported
[rank0]: Traceback (most recent call last):
[rank0]:   File "<frozen runpy>", line 198, in _run_module_as_main
[rank0]:   File "<frozen runpy>", line 88, in _run_code
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/vcgbench/pllava_eval_vcgbench.py", line 220, in <module>
[rank0]:     main()
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/vcgbench/pllava_eval_vcgbench.py", line 210, in main
[rank0]:     result_list = run(args=args, world_size=n_gpus, device_map=device_map, logging_level=logging_level)  # info
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/vcgbench/pllava_eval_vcgbench.py", line 112, in run
[rank0]:     model, processor, dataset = load_model_and_dataset(rank, world_size,
[rank0]:                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/vcgbench/pllava_eval_vcgbench.py", line 26, in load_model_and_dataset
[rank0]:     model, processor = load_pllava(pretrained_model_name_or_path, pretrained_processor_name_or_path, num_frames, device_map, use_video_encoder=use_video_encoder, use_dic_queries=use_dic_queries, use_cross_attention=use_cross_attention, use_lora=use_lora, weight_dir=weight_dir, lora_alpha=lora_alpha, pooling_shape=pooling_shape)
[rank0]:                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/model_utils.py", line 187, in load_pllava
[rank0]:     ds_engine = init_inference(model,
[rank0]:                 ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/__init__.py", line 364, in init_inference
[rank0]:     engine = InferenceEngine(model, config=ds_inference_config)
[rank0]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/inference/engine.py", line 160, in __init__
[rank0]:     self._apply_injection_policy(config)
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/inference/engine.py", line 421, in _apply_injection_policy
[rank0]:     replace_transformer_layer(client_module, self.module, checkpoint, config, self.config)
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 385, in replace_transformer_layer
[rank0]:     replaced_module = replace_module(model=model,
[rank0]:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 634, in replace_module
[rank0]:     replaced_module, _ = _replace_module(model, policy, state_dict=sd)
[rank0]:                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 694, in _replace_module
[rank0]:     _, layer_id = _replace_module(child,
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 694, in _replace_module
[rank0]:     _, layer_id = _replace_module(child,
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 694, in _replace_module
[rank0]:     _, layer_id = _replace_module(child,
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   [Previous line repeated 1 more time]
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 670, in _replace_module
[rank0]:     replaced_module = policies[child.__class__][0](child,
[rank0]:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 315, in replace_fn
[rank0]:     new_module = replace_with_policy(child,
[rank0]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 228, in replace_with_policy
[rank0]:     _container.set_moe(moe)
[rank0]:     ^^^^^^^^^^^^^^^^^^
[rank0]: AttributeError: 'NoneType' object has no attribute 'set_moe'
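
The INFO line at the top of the log points at the likely cause: the kernel-injection pass walks every submodule, and for a policy it does not support (here HFCLIPLayerPolicy, matched by the CLIP vision tower) it appears to return None instead of a container object, which replace_with_policy then dereferences via set_moe. Below is a self-contained sketch of that suspected pattern; all names are hypothetical stand-ins for DeepSpeed's internals, not the verbatim source:

import logging

POLICY_TO_CONTAINER = {}  # stand-in for DeepSpeed's policy-to-container map; HFCLIPLayerPolicy has no entry

class HFCLIPLayerPolicy:  # stand-in for the unsupported policy class
    pass

def policy_to_ds_container(policy):
    # Mirrors the suspected behavior: log "not supported" and fall through to None.
    policy_type = type(policy)
    if policy_type not in POLICY_TO_CONTAINER:
        logging.info("Policy type %s not supported", policy_type)
        return None
    return POLICY_TO_CONTAINER[policy_type](policy)

container = policy_to_ds_container(HFCLIPLayerPolicy())
container.set_moe(False)  # AttributeError: 'NoneType' object has no attribute 'set_moe'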

To Reproduce

@torch_no_grad()
def load_pllava(repo_id, pretrained_processor_name_or_path, num_frames, device_map, use_video_encoder, use_dic_queries, use_cross_attention, use_lora=False, weight_dir=None, lora_alpha=32, use_multi_gpus=False, pooling_shape=(16, 12, 12)):
    if num_frames == 0:
        # num_frames == 0 would silently break the pooling projector downstream, so fail fast.
        raise ValueError('load_pllava: num_frames == 0')
    torch_dtype = torch_bfloat16
    logger.info(f'load_pllava: repo_id={repo_id}; weight_dir={weight_dir}')
    repo_path = weight_dir if use_lora else repo_id
    config = PllavaConfig.from_pretrained(
        repo_id,
        use_flash_attention_2=is_gpu_ampere_or_later(),
        pooling_shape=pooling_shape,
        device_map=device_map,
        num_frames=num_frames,
        torch_dtype=torch_dtype,
    )
    config.use_video_encoder = use_video_encoder
    config.use_dic_queries = use_dic_queries
    config.use_cross_attention = use_cross_attention
    logger.info(f'load_pllava: initializing model from repo_id={repo_id}')
    model = PllavaForConditionalGeneration.from_pretrained(repo_id, config=config, torch_dtype=torch_dtype)
    model.eval()
    # model.to('cuda', non_blocking=True) # OOM for multi-gpu, in which case use accelerator.prepare_model(model)
    processor = PllavaProcessor.from_pretrained(repo_id, _attn_implementation="flash_attention_2")
    if use_lora:
        peft_config = LoraConfig(task_type=TaskType.CAUSAL_LM, inference_mode=True, target_modules=["q_proj", "v_proj"], r=128, lora_alpha=lora_alpha, lora_dropout=0.)
        model.language_model = language_model = get_peft_model(model.language_model, peft_config)
        language_model.print_trainable_parameters()
        logger.info(f"Finished constructing LoRA with lora_alpha/128={lora_alpha/128}")

    logger.info(f"load_from_pretrained from {weight_dir}")
    assert weight_dir is not None

    # load weights
    if weight_dir is not None:
        state_dict = {}
        save_fnames = listdir(weight_dir)
        if "model.safetensors" in save_fnames:
            use_full = False
            for fn in save_fnames:
                if fn.startswith('model-0'):
                    use_full = True
                    break
        else:
            use_full = True

        if not use_full:
            print("Loading weight from", weight_dir, "model.safetensors")
            with safe_open(f"{weight_dir}/model.safetensors", framework="pt", device="cpu") as f:
                for k in f.keys():
                    state_dict[k] = f.get_tensor(k)
        else:
            print("Loading weight from", weight_dir)
            for fn in save_fnames:
                if fn.startswith('model-0'):
                    with safe_open(f"{weight_dir}/{fn}", framework="pt", device="cpu") as f:
                        for k in f.keys():
                            state_dict[k] = f.get_tensor(k)

        if 'model' in state_dict:
            msg = model.load_state_dict(state_dict['model'], strict=False)
        else:
            msg = model.load_state_dict(state_dict, strict=False)
        print(msg)
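        # (Aside: the two safe_open loops above can be collapsed with
        # safetensors.torch.load_file, which returns a plain dict of tensors.
        # A sketch, assuming the shards carry the .safetensors extension:
        #     from pathlib import Path
        #     from safetensors.torch import load_file
        #     weight_path = Path(weight_dir)
        #     shards = sorted(weight_path.glob('model-0*.safetensors')) or [weight_path / 'model.safetensors']
        #     state_dict = {}
        #     for shard in shards:
        #         state_dict.update(load_file(shard, device='cpu'))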
    if weight_dir is not None:
        logger.info(f'Loading from pretrained_path: {weight_dir}')
        msg = load_from_pretrained(model, weight_dir, strict=not use_lora)
        if use_video_encoder:
            cfg_video_clip = {
                'arch': 'video_llama', 'image_size': 224, 'drop_path_rate': 0,
                'use_grad_checkpoint': False, 'vit_precision': 'fp16',
                'freeze_vit': True, 'freeze_qformer': True, 'num_query_token': 32,
                'llama_model': 'ckpt/vicuna-7b/', 'prompt': '',
                'model_type': 'pretrain_vicuna',
                'ckpt': './models_video_clip/askvideos_clip_v0.2.pth',
                'max_frame_pos': 32, 'clip_dim_size': 1024,
                'num_videoq_hidden_layers': 4,
            }
            module = model.model_video_clip
            if ckpt_path := cfg_video_clip['ckpt']:
                print(f"Load first Checkpoint: {ckpt_path}")
                with GatheredParameters(list(module.parameters(recurse=False)), modifier_rank=0), Accelerator().main_process_first():
                    # First load the BLIP-2 initialization, then the video-CLIP checkpoint on top of it.
                    url_or_filename = './models_video_clip/blip2_pretrained_flant5xxl.pth'
                    ckpt = torch_load(url_or_filename, map_location="cpu")
                    error_message = load_state_dict_into_model(module, ckpt['model'], '')
                    logger.info(error_message)

                    ckpt = torch_load(ckpt_path, map_location="cpu")
                    error_message = load_state_dict_into_model(module, ckpt['model'], '')
                    logger.info(error_message)

    from accelerate.utils import calculate_maximum_sizes, compute_module_sizes
    sizes = calculate_maximum_sizes(model)
    print(f'calculate_maximum_sizes: {sizes}')
    print(f'compute_module_sizes: {compute_module_sizes(model)}')
    # model = accelerator.prepare_model(model, device_placement=False, evaluation_mode=True)
    # Initialize the DeepSpeed-Inference engine
    from os import environ
    world_size = int(environ.get('WORLD_SIZE', '1'))
    ds_engine = init_inference(model,
                               tensor_parallel={"tp_size": world_size},  # without tensor parallelism, the model OOMs on one GPU
                               dtype=torch_dtype,
                               replace_with_kernel_inject=True)
    model = ds_engine.module

    return model, processor
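
A possible workaround (untested, following the suspected cause above) is to hand DeepSpeed only the language model, so the injection pass never visits the unsupported CLIP vision tower. Attribute names follow the repro code; if LoRA is active, model.language_model.merge_and_unload() (the standard peft API) would likely be needed first so injection sees plain Linear layers:

    ds_engine = init_inference(model.language_model,  # skip the CLIP tower entirely
                               tensor_parallel={"tp_size": world_size},
                               dtype=torch_dtype,
                               replace_with_kernel_inject=True)
    model.language_model = ds_engine.module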

Expected behavior
DeepSpeed should either support the layer or skip it cleanly; it should not surface a raw AttributeError on a NoneType to the user.
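
Concretely, continuing the sketch under the traceback, the failing dereference could raise an actionable error instead (a hypothetical guard, not a proposed patch against the actual source):

container = policy_to_ds_container(HFCLIPLayerPolicy())
if container is None:
    raise ValueError("No DeepSpeed inference container for policy HFCLIPLayerPolicy; "
                     "exclude this submodule from kernel injection or leave it unreplaced.")
container.set_moe(False)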

ds_report output
Attached as ds_report.log.

Screenshots
Attached: log_eval.log.

System info (please complete the following information):

  • OS: Ubuntu 24.04 LTS
  • GPU count and types: 2xGeForce 3090 Ti 24GB with NVLink
  • (if applicable) what DeepSpeed-MII version are you using: N/A
  • (if applicable) Hugging Face Transformers/Accelerate/etc. versions:
(multivu) zhanwen@zhanwen-mini:~/multivu$ pip list | grep -e transformers -e accelerate -e deepspeed -e torch
accelerate                     0.34.2
deepspeed                      0.15.1+10ba3dde
deepspeed-kernels              0.0.1.dev1698255861
torch                          2.4.0a0+gitee1b680
torchvision                    0.19.1a0+6194369
transformers                   4.44.2
  • Python version: Python 3.12.3 | packaged by Anaconda, Inc. | (main, May 6 2024, 19:46:43) [GCC 11.2.0] on linux

Docker context
N/A (running in a conda environment on the host, not in Docker).

Additional context

I built magma-cuda126 (2.6.1), torch (2.4.1), torchvision (0.19.1), and deepspeed (0.15.1) from source in the same environment, along with the following libraries, also built from source:

  • nccl (2.22.3)
  • cusparselt (0.6.2.3)

and the following libraries installed from apt:

(multivu) zhanwen@zhanwen-mini:~/multivu$ apt list --installed | grep -e cuda -e cudnn -e nccl -e cusparse 

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

cuda-cccl-12-6/unknown,now 12.6.37-1 amd64 [installed,automatic]
cuda-command-line-tools-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-compiler-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-crt-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cudart-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cudart-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cuobjdump-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cupti-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cupti-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cuxxfilt-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-documentation-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-driver-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-drivers-560/unknown,now 560.35.03-1 amd64 [installed,automatic]
cuda-drivers-fabricmanager-560/unknown,now 560.35.03-1 amd64 [installed]
cuda-gdb-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-keyring/unknown,unknown,now 1.1-1 all [installed]
cuda-libraries-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-libraries-dev-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-nsight-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nsight-compute-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-nsight-systems-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-nvcc-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvdisasm-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvml-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvprof-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvprune-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvrtc-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvrtc-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvtx-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvvm-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvvp-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-opencl-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-opencl-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-profiler-api-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-sanitizer-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-toolkit-12-6-config-common/unknown,now 12.6.68-1 all [installed,automatic]
cuda-toolkit-12-6/unknown,now 12.6.1-1 amd64 [installed]
cuda-toolkit-12-config-common/unknown,now 12.6.68-1 all [installed,automatic]
cuda-toolkit-config-common/unknown,now 12.6.68-1 all [installed,automatic]
cuda-tools-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-visual-tools-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cudnn9-cuda-12-6/unknown,now 9.4.0.58-1 amd64 [installed,automatic]
cudnn9-cuda-12/unknown,now 9.4.0.58-1 amd64 [installed]
libcudnn9-cuda-12/unknown,now 9.4.0.58-1 amd64 [installed,automatic]
libcudnn9-dev-cuda-12/unknown,now 9.4.0.58-1 amd64 [installed,automatic]
libcudnn9-static-cuda-12/unknown,now 9.4.0.58-1 amd64 [installed,automatic]
libcusparse-12-6/unknown,now 12.5.3.3-1 amd64 [installed]
libcusparse-dev-12-6/unknown,now 12.5.3.3-1 amd64 [installed]