[BUG] AttributeError: 'NoneType' object has no attribute 'set_moe' #6572

Open · zhanwenchen opened this issue on Sep 25, 2024
Labels: bug, inference

zhanwenchen commented on Sep 25, 2024

Describe the bug
During inference, I got AttributeError: 'NoneType' object has no attribute 'set_moe'. Immediately before the failure, the log also showed the INFO line: Policy type <class 'deepspeed.module_inject.containers.clip.HFCLIPLayerPolicy'> not supported.

(multivu) zhanwen@zhanwen-mini:~/multivu$ bash scripts/eval.sh | tee log_eval.log
[2024-09-25 15:53:43,055] [INFO] [logging.py:96:log_dist] [Rank 0] Policy type <class 'deepspeed.module_inject.containers.clip.HFCLIPLayerPolicy'> not supported
[rank0]: Traceback (most recent call last):
[rank0]:   File "<frozen runpy>", line 198, in _run_module_as_main
[rank0]:   File "<frozen runpy>", line 88, in _run_code
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/vcgbench/pllava_eval_vcgbench.py", line 220, in <module>
[rank0]:     main()
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/vcgbench/pllava_eval_vcgbench.py", line 210, in main
[rank0]:     result_list = run(args=args, world_size=n_gpus, device_map=device_map, logging_level=logging_level)  # info
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/vcgbench/pllava_eval_vcgbench.py", line 112, in run
[rank0]:     model, processor, dataset = load_model_and_dataset(rank, world_size,
[rank0]:                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/vcgbench/pllava_eval_vcgbench.py", line 26, in load_model_and_dataset
[rank0]:     model, processor = load_pllava(pretrained_model_name_or_path, pretrained_processor_name_or_path, num_frames, device_map, use_video_encoder=use_video_encoder, use_dic_queries=use_dic_queries, use_cross_attention=use_cross_attention, use_lora=use_lora, weight_dir=weight_dir, lora_alpha=lora_alpha, pooling_shape=pooling_shape)
[rank0]:                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/multivu/tasks/eval/model_utils.py", line 187, in load_pllava
[rank0]:     ds_engine = init_inference(model,
[rank0]:                 ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/__init__.py", line 364, in init_inference
[rank0]:     engine = InferenceEngine(model, config=ds_inference_config)
[rank0]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/inference/engine.py", line 160, in __init__
[rank0]:     self._apply_injection_policy(config)
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/inference/engine.py", line 421, in _apply_injection_policy
[rank0]:     replace_transformer_layer(client_module, self.module, checkpoint, config, self.config)
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 385, in replace_transformer_layer
[rank0]:     replaced_module = replace_module(model=model,
[rank0]:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 634, in replace_module
[rank0]:     replaced_module, _ = _replace_module(model, policy, state_dict=sd)
[rank0]:                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 694, in _replace_module
[rank0]:     _, layer_id = _replace_module(child,
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 694, in _replace_module
[rank0]:     _, layer_id = _replace_module(child,
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 694, in _replace_module
[rank0]:     _, layer_id = _replace_module(child,
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   [Previous line repeated 1 more time]
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 670, in _replace_module
[rank0]:     replaced_module = policies[child.__class__][0](child,
[rank0]:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 315, in replace_fn
[rank0]:     new_module = replace_with_policy(child,
[rank0]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/zhanwen/miniconda3/envs/multivu/lib/python3.12/site-packages/deepspeed/module_inject/replace_module.py", line 228, in replace_with_policy
[rank0]:     _container.set_moe(moe)
[rank0]:     ^^^^^^^^^^^^^^^^^^
[rank0]: AttributeError: 'NoneType' object has no attribute 'set_moe'
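
The INFO line at the top of the log points at the likely cause: the kernel-injection pass walks every submodule, and for a policy it does not support (here HFCLIPLayerPolicy, matched by the CLIP vision tower) it appears to return None instead of a container object, which replace_with_policy then dereferences via set_moe. Below is a self-contained sketch of that suspected pattern; all names are hypothetical stand-ins for DeepSpeed's internals, not the verbatim source:

import logging

POLICY_TO_CONTAINER = {}  # stand-in for DeepSpeed's policy-to-container map; HFCLIPLayerPolicy has no entry

class HFCLIPLayerPolicy:  # stand-in for the unsupported policy class
    pass

def policy_to_ds_container(policy):
    # Mirrors the suspected behavior: log "not supported" and fall through to None.
    policy_type = type(policy)
    if policy_type not in POLICY_TO_CONTAINER:
        logging.info("Policy type %s not supported", policy_type)
        return None
    return POLICY_TO_CONTAINER[policy_type](policy)

container = policy_to_ds_container(HFCLIPLayerPolicy())
container.set_moe(False)  # AttributeError: 'NoneType' object has no attribute 'set_moe'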

To Reproduce

@torch_no_grad()
def load_pllava(repo_id, pretrained_processor_name_or_path, num_frames, device_map, use_video_encoder, use_dic_queries, use_cross_attention, use_lora=False, weight_dir=None, lora_alpha=32, use_multi_gpus=False, pooling_shape=(16, 12, 12)):
    if num_frames == 0:
        # num_frames == 0 would silently break the pooling projector downstream, so fail fast.
        raise ValueError('load_pllava: num_frames == 0')
    torch_dtype = torch_bfloat16
    logger.info(f'load_pllava: repo_id={repo_id}; weight_dir={weight_dir}')
    repo_path = weight_dir if use_lora else repo_id
    config = PllavaConfig.from_pretrained(
        repo_id,
        use_flash_attention_2=is_gpu_ampere_or_later(),
        pooling_shape=pooling_shape,
        device_map=device_map,
        num_frames=num_frames,
        torch_dtype=torch_dtype,
    )
    config.use_video_encoder = use_video_encoder
    config.use_dic_queries = use_dic_queries
    config.use_cross_attention = use_cross_attention
    logger.info(f'load_pllava: initializing model from repo_id={repo_id}')
    model = PllavaForConditionalGeneration.from_pretrained(repo_id, config=config, torch_dtype=torch_dtype)
    model.eval()
    # model.to('cuda', non_blocking=True) # OOM for multi-gpu, in which case use accelerator.prepare_model(model)
    processor = PllavaProcessor.from_pretrained(repo_id, _attn_implementation="flash_attention_2")
    if use_lora:
        peft_config = LoraConfig(task_type=TaskType.CAUSAL_LM, inference_mode=True, target_modules=["q_proj", "v_proj"], r=128, lora_alpha=lora_alpha, lora_dropout=0.)
        model.language_model = language_model = get_peft_model(model.language_model, peft_config)
        language_model.print_trainable_parameters()
        logger.info(f"Finished constructing LoRA with lora_alpha/128={lora_alpha/128}")

    logger.info(f"load_from_pretrained from {weight_dir}")
    assert weight_dir is not None

    # load weights
    if weight_dir is not None:
        state_dict = {}
        save_fnames = listdir(weight_dir)
        if "model.safetensors" in save_fnames:
            use_full = False
            for fn in save_fnames:
                if fn.startswith('model-0'):
                    use_full = True
                    break
        else:
            use_full = True

        if not use_full:
            print("Loading weight from", weight_dir, "model.safetensors")
            with safe_open(f"{weight_dir}/model.safetensors", framework="pt", device="cpu") as f:
                for k in f.keys():
                    state_dict[k] = f.get_tensor(k)
        else:
            print("Loading weight from", weight_dir)
            for fn in save_fnames:
                if fn.startswith('model-0'):
                    with safe_open(f"{weight_dir}/{fn}", framework="pt", device="cpu") as f:
                        for k in f.keys():
                            state_dict[k] = f.get_tensor(k)

        if 'model' in state_dict:
            msg = model.load_state_dict(state_dict['model'], strict=False)
        else:
            msg = model.load_state_dict(state_dict, strict=False)
        print(msg)
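        # (Aside: the two safe_open loops above can be collapsed with
        # safetensors.torch.load_file, which returns a plain dict of tensors.
        # A sketch, assuming the shards carry the .safetensors extension:
        #     from pathlib import Path
        #     from safetensors.torch import load_file
        #     weight_path = Path(weight_dir)
        #     shards = sorted(weight_path.glob('model-0*.safetensors')) or [weight_path / 'model.safetensors']
        #     state_dict = {}
        #     for shard in shards:
        #         state_dict.update(load_file(shard, device='cpu'))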
    if weight_dir is not None:
        logger.info(f'Loading from pretrained_path: {weight_dir}')
        msg = load_from_pretrained(model, weight_dir, strict=not use_lora)
        if use_video_encoder:
            cfg_video_clip = {
                'arch': 'video_llama', 'image_size': 224, 'drop_path_rate': 0,
                'use_grad_checkpoint': False, 'vit_precision': 'fp16',
                'freeze_vit': True, 'freeze_qformer': True, 'num_query_token': 32,
                'llama_model': 'ckpt/vicuna-7b/', 'prompt': '',
                'model_type': 'pretrain_vicuna',
                'ckpt': './models_video_clip/askvideos_clip_v0.2.pth',
                'max_frame_pos': 32, 'clip_dim_size': 1024,
                'num_videoq_hidden_layers': 4,
            }
            module = model.model_video_clip
            if ckpt_path := cfg_video_clip['ckpt']:
                print(f"Load first Checkpoint: {ckpt_path}")
                with GatheredParameters(list(module.parameters(recurse=False)), modifier_rank=0), Accelerator().main_process_first():
                    # First load the BLIP-2 initialization, then the video-CLIP checkpoint on top of it.
                    url_or_filename = './models_video_clip/blip2_pretrained_flant5xxl.pth'
                    ckpt = torch_load(url_or_filename, map_location="cpu")
                    error_message = load_state_dict_into_model(module, ckpt['model'], '')
                    logger.info(error_message)

                    ckpt = torch_load(ckpt_path, map_location="cpu")
                    error_message = load_state_dict_into_model(module, ckpt['model'], '')
                    logger.info(error_message)

    from accelerate.utils import calculate_maximum_sizes, compute_module_sizes
    sizes = calculate_maximum_sizes(model)
    print(f'calculate_maximum_sizes: {sizes}')
    print(f'compute_module_sizes: {compute_module_sizes(model)}')
    # model = accelerator.prepare_model(model, device_placement=False, evaluation_mode=True)
    # Initialize the DeepSpeed-Inference engine
    from os import environ
    world_size = int(environ.get('WORLD_SIZE', '1'))
    ds_engine = init_inference(model,
                               tensor_parallel={"tp_size": world_size},  # without tensor parallelism, the model OOMs on one GPU
                               dtype=torch_dtype,
                               replace_with_kernel_inject=True)
    model = ds_engine.module

    return model, processor
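
A possible workaround (untested, following the suspected cause above) is to hand DeepSpeed only the language model, so the injection pass never visits the unsupported CLIP vision tower. Attribute names follow the repro code; if LoRA is active, model.language_model.merge_and_unload() (the standard peft API) would likely be needed first so injection sees plain Linear layers:

    ds_engine = init_inference(model.language_model,  # skip the CLIP tower entirely
                               tensor_parallel={"tp_size": world_size},
                               dtype=torch_dtype,
                               replace_with_kernel_inject=True)
    model.language_model = ds_engine.module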

Expected behavior
DeepSpeed should either support the layer or skip it cleanly; it should not surface a raw AttributeError on a NoneType to the user.
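
Concretely, continuing the sketch under the traceback, the failing dereference could raise an actionable error instead (a hypothetical guard, not a proposed patch against the actual source):

container = policy_to_ds_container(HFCLIPLayerPolicy())
if container is None:
    raise ValueError("No DeepSpeed inference container for policy HFCLIPLayerPolicy; "
                     "exclude this submodule from kernel injection or leave it unreplaced.")
container.set_moe(False)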

ds_report output
Attached as ds_report.log.

Screenshots
Attached: log_eval.log.

System info (please complete the following information):

  • OS: Ubuntu 24.04 LTS
  • GPU count and types: 2xGeForce 3090 Ti 24GB with NVLink
  • (if applicable) what DeepSpeed-MII version are you using: N/A
  • (if applicable) Hugging Face Transformers/Accelerate/etc. versions:
(multivu) zhanwen@zhanwen-mini:~/multivu$ pip list | grep -e transformers -e accelerate -e deepspeed -e torch
accelerate                     0.34.2
deepspeed                      0.15.1+10ba3dde
deepspeed-kernels              0.0.1.dev1698255861
torch                          2.4.0a0+gitee1b680
torchvision                    0.19.1a0+6194369
transformers                   4.44.2
  • Python version: Python 3.12.3 | packaged by Anaconda, Inc. | (main, May 6 2024, 19:46:43) [GCC 11.2.0] on linux

Docker context
N/A (running in a conda environment on the host, not in Docker).

Additional context

I built magma-cuda126 (2.6.1), torch (2.4.1), torchvision (0.19.1), and deepspeed (0.15.1) from source in the same environment, along with the following libraries, also built from source:

  • nccl (2.22.3)
  • cusparselt (0.6.2.3)

and the following libraries installed from apt:

(multivu) zhanwen@zhanwen-mini:~/multivu$ apt list --installed | grep -e cuda -e cudnn -e nccl -e cusparse 

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

cuda-cccl-12-6/unknown,now 12.6.37-1 amd64 [installed,automatic]
cuda-command-line-tools-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-compiler-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-crt-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cudart-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cudart-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cuobjdump-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cupti-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cupti-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-cuxxfilt-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-documentation-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-driver-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-drivers-560/unknown,now 560.35.03-1 amd64 [installed,automatic]
cuda-drivers-fabricmanager-560/unknown,now 560.35.03-1 amd64 [installed]
cuda-gdb-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-keyring/unknown,unknown,now 1.1-1 all [installed]
cuda-libraries-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-libraries-dev-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-nsight-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nsight-compute-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-nsight-systems-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-nvcc-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvdisasm-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvml-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvprof-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvprune-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvrtc-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvrtc-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvtx-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvvm-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-nvvp-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-opencl-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-opencl-dev-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-profiler-api-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-sanitizer-12-6/unknown,now 12.6.68-1 amd64 [installed,automatic]
cuda-toolkit-12-6-config-common/unknown,now 12.6.68-1 all [installed,automatic]
cuda-toolkit-12-6/unknown,now 12.6.1-1 amd64 [installed]
cuda-toolkit-12-config-common/unknown,now 12.6.68-1 all [installed,automatic]
cuda-toolkit-config-common/unknown,now 12.6.68-1 all [installed,automatic]
cuda-tools-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cuda-visual-tools-12-6/unknown,now 12.6.1-1 amd64 [installed,automatic]
cudnn9-cuda-12-6/unknown,now 9.4.0.58-1 amd64 [installed,automatic]
cudnn9-cuda-12/unknown,now 9.4.0.58-1 amd64 [installed]
libcudnn9-cuda-12/unknown,now 9.4.0.58-1 amd64 [installed,automatic]
libcudnn9-dev-cuda-12/unknown,now 9.4.0.58-1 amd64 [installed,automatic]
libcudnn9-static-cuda-12/unknown,now 9.4.0.58-1 amd64 [installed,automatic]
libcusparse-12-6/unknown,now 12.5.3.3-1 amd64 [installed]
libcusparse-dev-12-6/unknown,now 12.5.3.3-1 amd64 [installed]