
Ability to load local models #39

Open
shimizust opened this issue Feb 9, 2024 · 4 comments

@shimizust

Hi, thanks for making this project available!

I was wondering if it's possible to point directly to local models, instead of downloading them from a URL or the HF Hub?

@gadicc (Collaborator) commented Feb 10, 2024

Hey! Sure, thanks.

In theory this should be pretty easy, but I've never tried it personally 😅
Don't forget that if you're deploying your model somewhere else, you'll have to include the model in your Docker build.

So, I think it should be this simple:

  1. Set RUNTIME_DOWNLOADS=0
  2. Set MODEL_ID to the directory containing your local model (in diffusers format).
  3. Set MODEL_PRECISION="fp16" if relevant.
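
For example, passed as environment variables when starting the container (an illustrative sketch; /path/to/your-model is a placeholder for your own model directory):

RUNTIME_DOWNLOADS=0
MODEL_ID=/path/to/your-model        # directory in diffusers format
MODEL_PRECISION=fp16                # only if relevant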

That's assuming you only want one model loaded per run. If you want to be able to switch models at runtime, you can instead pass MODEL_ID (pointing to the directory containing the local model) as part of your request. You may also need to set RUNTIME_DOWNLOADS=1, but try without it first.

Hope that helps! Let me know either way.

@shimizust (Author)

Thanks for the tips, @gadicc!

I tried this using https://huggingface.co/CompVis/stable-diffusion-v1-4 in a volume mounted to the container. I think it's loading the model fine, but I'm getting an error during inference. Any ideas what I'm doing wrong?

import requests

url = "http://localhost:8000/"  # the inference server endpoint (see the pod logs below)

data = {
    "modelInputs": {
        "prompt": "Super dog",
        "num_inference_steps": 5
    },
    "callInputs": {
        "MODEL_ID": "/shared/user/sshimizu/stable-diffusion-v1-4",
        "PIPELINE": "StableDiffusionPipeline",
        "SCHEDULER": "DPMSolverMultistepScheduler",
        "RUNTIME_DOWNLOADS": 0,
        "MODEL_PRECISION": "fp16",
        "safety_checker": "true",
    },
}

response = requests.post(url, json=data)

I get this error:

{"$error":{"code":"PIPELINE_ERROR","name":"TypeError","message":"unsupported operand type(s) for %: 'int' and 'NoneType'","stack":"Traceback (most recent call last):\n  File \"\/api\/app.py\", line 638, in inference\n    images = (await async_pipeline).images\n  File \"\/opt\/conda\/lib\/python3.10\/asyncio\/threads.py\", line 25, in to_thread\n    return await loop.run_in_executor(None, func_call)\n  File \"\/opt\/conda\/lib\/python3.10\/concurrent\/futures\/thread.py\", line 58, in run\n    result = self.fn(*self.args, **self.kwargs)\n  File \"\/opt\/conda\/lib\/python3.10\/site-packages\/torch\/utils\/_contextlib.py\", line 115, in decorate_context\n    return func(*args, **kwargs)\n  File \"\/opt\/conda\/lib\/python3.10\/site-packages\/diffusers\/pipelines\/stable_diffusion\/pipeline_stable_diffusion.py\", line 1062, in __call__\n    if callback is not None and i % callback_steps == 0:\nTypeError: unsupported operand type(s) for %: 'int' and 'NoneType'\n"}}

And here are the pod logs:

[2024-02-13 01:29:10 +0000] - (sanic.access)[INFO][127.0.0.1:36822]: POST http://localhost:8000/  200 975
{
  "modelInputs": {
    "prompt": "Super dog",
    "num_inference_steps": 5
  },
  "callInputs": {
    "MODEL_ID": "/shared/user/sshimizu/stable-diffusion-v1-4",
    "PIPELINE": "StableDiffusionPipeline",
    "SCHEDULER": "DPMSolverMultistepScheduler",
    "RUNTIME_DOWNLOADS": 0,
    "MODEL_PRECISION": "FP16",
    "safety_checker": "true"
  }
}
download_model {'model_url': None, 'model_id': '/shared/user/sshimizu/stable-diffusion-v1-4', 'model_revision': None, 'hf_model_id': None, 'checkpoint_url': None, 'checkpoint_config_url': None}
loadModel {'model_id': '/shared/user/sshimizu/stable-diffusion-v1-4', 'load': False, 'precision': 'FP16', 'revision': None, 'pipeline_class': <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>}
pipeline <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>
Downloading model: /shared/user/sshimizu/stable-diffusion-v1-4
Keyword arguments {'use_auth_token': None} are not expected by StableDiffusionPipeline and will be ignored.
Loading pipeline components...:  57%|█████▋    | 4/7 [00:00<00:00,  7.00it/s]`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00,  7.70it/s]
Downloaded in 913 ms
2024-02-13 06:24:09.407129 {'type': 'loadModel', 'status': 'start', 'container_id': 'inference-server', 'time': 1707805449407, 't': 1511, 'tsl': 0, 'payload': {'startRequestId': None}}
loadModel {'model_id': '/shared/user/sshimizu/stable-diffusion-v1-4', 'load': True, 'precision': 'FP16', 'revision': None, 'pipeline_class': <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>}
pipeline <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>
Loading model: /shared/user/sshimizu/stable-diffusion-v1-4
Keyword arguments {'use_auth_token': None} are not expected by StableDiffusionPipeline and will be ignored.
Loading pipeline components...:  43%|████▎     | 3/7 [00:00<00:00, 10.72it/s]`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00,  8.13it/s]
Loaded from disk in 865 ms, to gpu in 811 ms
2024-02-13 06:24:11.083625 {'type': 'loadModel', 'status': 'done', 'container_id': 'inference-server', 'time': 1707805451084, 't': 3188, 'tsl': 1677, 'payload': {'startRequestId': None}}
Initialized StableDiffusionPipeline for /shared/user/sshimizu/stable-diffusion-v1-4 in 1ms
{'cross_attention_kwargs': {}}
2024-02-13 06:24:11.121568 {'type': 'inference', 'status': 'start', 'container_id': 'inference-server', 'time': 1707805451122, 't': 3226, 'tsl': 0, 'payload': {'startRequestId': None}}
{'callback': <function inference.<locals>.callback at 0x77dc81dfa830>, '**model_inputs': {'prompt': 'Super dog', 'num_inference_steps': 5, 'generator': <torch._C.Generator object at 0x77dc827abf50>}}
 20%|██        | 1/5 [00:00<00:00, 12.56it/s]
[2024-02-13 06:24:11 +0000] - (sanic.access)[INFO][127.0.0.1:45132]: POST http://localhost:8000/  200 975

@gadicc (Collaborator) commented Feb 13, 2024

Hey @shimizust

Looks like a bug... maybe because upstream diffusers removed a default; otherwise I'm not sure why we never saw this before.


Let me first explain the [most relevant lines of the] error, and then the fix. You don't need to know or understand any of this, so feel free to skip this part if it's not of interest.

Line: if callback is not None and i % callback_steps == 0
Error: unsupported operand type(s) for %: 'int' and 'NoneType'

So it's trying to calculate "x % y" (the modulo operation, i.e. if we divide x by y, what will the remainder be?).
That obviously requires two numbers (integers), but here the error is telling us that the second operand (callback_steps) is not a number at all: it's a NoneType (i.e. it doesn't exist), which is why we get the error.
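
You can reproduce the same error in a plain Python shell:

>>> 5 % None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for %: 'int' and 'NoneType'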

Now, what leads to this error is a bit more complicated. In docker-diffusers-api, we automatically set a callback (to be run every callback_steps steps) if none is provided. This used to work fine, but I guess diffusers now expects callback_steps to be given explicitly whenever callback is.
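
For context, this is roughly what that callback mechanism looks like in plain diffusers (a sketch, not docker-diffusers-api's actual code; the model path is a placeholder):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "/path/to/stable-diffusion-v1-4",  # placeholder: your local model directory
    torch_dtype=torch.float16,
).to("cuda")

# diffusers invokes this every callback_steps denoising steps
def on_step(step: int, timestep: int, latents: torch.Tensor) -> None:
    print(f"step {step}")

# if callback is set but callback_steps ends up None, the
# `i % callback_steps` check inside the pipeline raises the TypeError above
image = pipe("Super dog", num_inference_steps=5,
             callback=on_step, callback_steps=1).images[0]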


So, the workaround (until I push a proper fix) is to provide a modelInput called callback_steps with an integer value, e.g.

{
  "modelInputs": {
    // ...
    "callback_steps": 20
  },
  // ...
}

This just controls how often we report back the current progress via webhook... if that's irrelevant for your application, just use a number higher than your num_inference_steps.


Two other things I noticed (unrelated):

  • You included "RUNTIME_DOWNLOADS": 0, but this is only recognized as an environment variable, not as part of the request.

  • You have safety_checker: "true", but this should be a boolean and not a string, i.e. True not "true". I'm not really sure how this affects things, but fixing it avoids problems further down :)
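
Putting all of that together, the request from earlier would look something like this (same placeholder endpoint as before):

import requests

url = "http://localhost:8000/"  # the endpoint from the logs above

data = {
    "modelInputs": {
        "prompt": "Super dog",
        "num_inference_steps": 5,
        # workaround: pick a value above num_inference_steps if you don't need progress reports
        "callback_steps": 20,
    },
    "callInputs": {
        "MODEL_ID": "/shared/user/sshimizu/stable-diffusion-v1-4",
        "PIPELINE": "StableDiffusionPipeline",
        "SCHEDULER": "DPMSolverMultistepScheduler",
        # RUNTIME_DOWNLOADS removed: it's only recognized as an environment variable
        "MODEL_PRECISION": "fp16",
        "safety_checker": True,  # boolean, not the string "true"
    },
}

response = requests.post(url, json=data)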

Good luck!

@shimizust (Author)

Thanks, @gadicc! Setting "callback_steps" to an int in "modelInputs" works, and I'm able to generate images from my local model now.

I guess setting RUNTIME_DOWNLOADS isn't strictly necessary then.
