lora support, re-organize files.

matatonic · Sep 6, 2024 · 9636570
1 parent 417fdc6

Showing 31 changed files with 367 additions and 151 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,13 +1,19 @@
 .env
 images.env
-test-*/
+test*/
 hf_home/
-debug.png
-output.png
-new_logo*.png
-config/config.json
+models/hf_home/
+*.png
+!logo.png
+!open-webui-config.png
 TODO.md
 
+config/*.json
+lora/*
+!lora/lora_files_go_here.txt
+models/*
+!models/model_files_go_here.txt
+
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
@@ -168,3 +174,5 @@ cython_debug/
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+
+!config/lib
diff --git a/CONFIG.md b/CONFIG.md
@@ -0,0 +1,154 @@
+# OpenedAI Images Flux
+
+## Configuration
+
+All of the configuration settings are stored in the `config/` folder, and can (mostly*) be modified as needed without needing to restart the server.
+
+`config.json` is the primary configuration file; it contains the mapping of each `model` to its `generator` and `enhancer` configurations. A new `config.json` will be created if one doesn't already exist.
+
+A basic `config.json` might look like this:
+
+```json
+{
+  "models": {
+    "dall-e-2": {
+      "generator": "flux.1-schnell.json"
+    },
+    "dall-e-3": {
+      "generator": "flux.1-dev.json",
+      "enhancer": "openai-enhancer.json"
+    }
+  }
+}
+```
+
+The default `config.default.json` provided is much more complete, with many more options available.
+
+### Generator JSON Configuration
+
+Generation parameters are selected by `quality`. The quality names are completely configurable and can be anything you want; `standard` and `hd` are the values available in the OpenAI API, but you can define whatever names you like.
+
+Sample Generator JSON:
+
+```json
+{
+  "pipeline": {
+    "pretrained_model_name_or_path": "black-forest-labs/FLUX.1-dev",
+    "torch_dtype": "bfloat16"
+  },
+  "options": {
+    "enable_sequential_cpu_offload": false,
+    "enable_model_cpu_offload": false,
+    "enable_vae_slicing": false,
+    "enable_vae_tiling": false,
+    "to": {
+      "device": "cuda"
+    }
+  },
+  "generation_kwargs": {
+    "standard": {
+      "guidance_scale": 3.5,
+      "num_inference_steps": 25
+    },
+    "hd": {
+      "guidance_scale": 5.5,
+      "num_inference_steps": 50
+    }
+  }
+}
+
+```
+
+The format is very flexible: many entries are not pre-defined but are passed as keyword arguments in calls to `diffusers` Python objects.
+
+#### Local model files
+
+Here is another sample showing how to use local model files with a fine-tune, without downloading from Hugging Face:
+
+```json
+{
+  "pipeline": {
+    "pretrained_model_name_or_path": "./models/black-forest-labs/FLUX.1-dev",
+    "torch_dtype": "bfloat16",
+    "FluxTransformer2DModel": {
+      "pretrained_model_link_or_path_or_dict": "./models/STOIQONewrealityFLUXSD_F1DPreAlpha.safetensors",
+      "torch_dtype": "bfloat16"
+    }
+  },
+  "options": {
+    "to": {
+      "device": "cuda"
+    }
+  },
+  "generation_kwargs": {
+    "guidance_scale": 3.5,
+    "num_inference_steps": 50
+  }
+}
+```
+
+#### Lora Configuration
+
+Multiple loras can be added in a list, each with its own scaling factor (`lora_scale`), which is used when fusing the lora with the main model.
+
+Sample Lora config:
+
+```json
+{
+  "pipeline": {
+    "pretrained_model_name_or_path": "black-forest-labs/FLUX.1-dev",
+    "torch_dtype": "bfloat16",
+    "Loras": [
+      {
+        "weights": {
+          "pretrained_model_name_or_path_or_dict": "./lora",
+          "weight_name": "some_lora_file_1.safetensors"
+        },
+        "lora_scale": 0.8
+      },
+      {
+        "weights": {
+          "pretrained_model_name_or_path_or_dict": "./lora",
+          "weight_name": "some_other_lora_file_2.safetensors"
+        },
+        "lora_scale": 1.0
+      }
+    ]
+  },
+  "options": {
+    "enable_sequential_cpu_offload": true,
+    "to": {
+      "device": "cuda"
+    }
+  },
+  "generation_kwargs": {
+    "guidance_scale": 3.5,
+    "num_inference_steps": 50
+  }
+}
+```
+
+### Enhancer JSON Configuration
+
+Enhancers are optional but provide additional functionality to improve image quality and creativity. They depend on an OpenAI chat API.
+
+A simple `enhancer.json` may look like this (including example exchanges in `messages` tends to work well):
+
+```json
+{
+  "model": "gpt-3.5-turbo",
+  "messages": [
+    {"role": "system", "content": "You are an AI prompt enhancer for image generation AI models, when given a description of an image, be creative and enhance the description to make the image better. No yapping, just output the best enhanced image prompt you can."},
+    {"role": "user", "content": "photo of an ancient castle very atmospheric"},
+    {"role": "assistant", "content": "photo of an ancient castle with a majestic, eerie ambiance, highlighting its towering spires, weathered stone, and ivy-clad walls, set against a backdrop of a dense, mystic forest under a sky completely shrouded in clouds yet illuminated by a diffused, ethereal light, where the fusion of natural decay and enduring architectural grandeur evokes a sense of timeless mystery and the eternal battle between man and nature, casting a glow that reveals subtle details and textures, creating an impression of haunting beauty and solemn tranquility. "},
+    {"role": "user", "content": "photo of a futuristic cityscape with flying cars"},
+    {"role": "assistant", "content": " A bustling futuristic cityscape stretches out beneath a twilight sky, where sleek, metallic skyscrapers pierce through wisps of neon-lit clouds. Flying cars, adorned with glowing blue and purple lights, zip silently along invisible aerial highways, casting dynamic shadows on the glassy surfaces below. The city’s architecture blends organic curves with sharp geometric lines, creating a harmonious fusion of nature and technology. Holographic billboards flicker with vibrant advertisements, while pedestrians in streamlined attire stroll along elevated walkways lined with lush vertical gardens. The scene is bathed in a soft, ethereal glow from the setting sun, reflecting off the polished surfaces and adding a warm contrast to the cool artificial lights. The composition captures the city from a bird’s-eye view, showcasing its intricate layout and futuristic charm, evoking a sense of wonder and technological advancement."}
+  ],
+  "temperature": 1.0,
+  "max_tokens": 256
+}
+```
+
+There are 2 pre-configured, rule-based OpenAI enhancers: `openai-enhancer.json`, which contains some additional safety features, and `openai-enhancer-research.json`, which is for scientists. Sometimes they make a stunning difference, but they don't always work perfectly. The results are highly dependent on the quality of the chat model you use, so feel free to create your own and experiment.
+
+You may also find that FLUX.1 performs very well with no prompt enhancement at all.
+
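Note on the Lora Configuration section of CONFIG.md above: the guide says each entry's `lora_scale` is used when fusing the lora with the main model. The Python sketch below shows one way such a list could be applied to a `diffusers` pipeline. It is not taken from this commit. `apply_loras` and `RecordingPipe` are hypothetical names, the load, fuse, unload order and the default scale of 1.0 are assumptions, and the stub stands in for a real pipeline so the example runs without a GPU. The three method names mirror the `load_lora_weights`, `fuse_lora` and `unload_lora_weights` methods that `diffusers` pipelines with LoRA support provide.

```python
from typing import Any


def apply_loras(pipe: Any, loras: list[dict[str, Any]]) -> None:
    """Fuse each configured lora into the pipeline at its own strength."""
    for lora in loras:
        # The "weights" entry is passed through unchanged as keyword arguments.
        pipe.load_lora_weights(**lora["weights"])
        # Assumption: a missing "lora_scale" means full strength (1.0).
        pipe.fuse_lora(lora_scale=lora.get("lora_scale", 1.0))
        # After fusing, the separate adapter weights are no longer needed.
        pipe.unload_lora_weights()


class RecordingPipe:
    """Hypothetical stand-in for a diffusers pipeline; it only records calls."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict[str, Any]]] = []

    def load_lora_weights(self, **kwargs: Any) -> None:
        self.calls.append(("load_lora_weights", kwargs))

    def fuse_lora(self, **kwargs: Any) -> None:
        self.calls.append(("fuse_lora", kwargs))

    def unload_lora_weights(self) -> None:
        self.calls.append(("unload_lora_weights", {}))


# Same shape as the "Loras" list in the sample config.
loras = [
    {
        "weights": {
            "pretrained_model_name_or_path_or_dict": "./lora",
            "weight_name": "some_lora_file_1.safetensors",
        },
        "lora_scale": 0.8,
    },
    {
        "weights": {
            "pretrained_model_name_or_path_or_dict": "./lora",
            "weight_name": "some_other_lora_file_2.safetensors",
        },
        "lora_scale": 1.0,
    },
]

pipe = RecordingPipe()
apply_loras(pipe, loras)
for name, kwargs in pipe.calls:
    print(name, kwargs)
```

Fusing one lora at a time is what lets each file carry its own strength. With a real pipeline, the stub would be replaced by the loaded pipeline object.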
diff --git a/Dockerfile b/Dockerfile
@@ -1,15 +1,16 @@
 FROM python:3.11-slim
-
 # For qint4 support, 2x docker image
 #FROM nvidia/cuda:12.1.0-devel-ubuntu22.04
-#RUN apt-get update && apt-get install --no-install-recommends -y build-essential python3-pip python-is-python3 python3-dev && apt-get clean && rm -rf /var/lib/apt/lists/*
+RUN apt-get update && apt-get install --no-install-recommends -y \
+  build-essential python3-pip python-is-python3 python3-dev \
+  && apt-get clean && rm -rf /var/lib/apt/lists/*
 
 WORKDIR /app
-RUN mkdir config
+RUN mkdir config models lora
 COPY requirements.txt .
 RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
+COPY config/lib /app/config/
 COPY *.py *.json LICENSE /app/
-COPY config/*.json /app/config/
 
 ENV CLI_COMMAND="python images.py"
 CMD $CLI_COMMAND
diff --git a/README.md b/README.md
@@ -17,8 +17,8 @@ An OpenAI API compatible image generation server for the FLUX.1 family of models
 - **Flexible**: Configurable settings for different models and enhancements
 - **Enhancements**: Supports flexible AI prompt enhancers
 - **Standalone Image Generation**: Uses your Nvidia GPU for image generation, doesn't use ComfyUI, SwarmUI or any other backend
+- **Lora Support**: Support for multiple loras with individual scaling weights (strength)
 - [ ] **Easy to setup and use**: Maybe?
-- [ ] **Lora Support** (planned)
 - [ ] **Upscaler Support** (planned)
 - [ ] **BNB NF4 Quantization** (planned)
 - [ ] **GGUF Loading** (planned)
@@ -112,83 +112,11 @@ See the OpenAI Images Guide API and API Documentation for more ways to use the A
 
 ## Configuration
 
-All of the configuration settings are stored in the `config/` folder, and can (mostly*) be modified as needed without needing to restart the server.
-
-`config.json` is the primary configuration file, it contains the mapping of `model` to `generator` and `enhancer` configurations.
-
-A basic `config.json` might look like this:
-```json
-{
-  "models": {
-    "dall-e-2": {
-      "generator": "flux.1-schnell.json"
-    },
-    "dall-e-3": {
-      "generator": "flux.1-dev.json",
-      "enhancer": "openai-enhancer.json"
-    }
-}
-```
-
-The default `config.json` provided is much more robust with many more options available.
-
-### Generator JSON Configuration
-
-Generation parameters can be set with `quality` and is completely configurable and can be anything want, the `standard` and `hd` settings are available in the OpenAI API, but you can use whatever you want.
-
-```json
-{
-  "pipeline": {
-    "pretrained_model_name_or_path": "black-forest-labs/FLUX.1-dev",
-    "torch_dtype": "bfloat16"
-  },
-  "options": {
-    "enable_sequential_cpu_offload": false,
-    "to": {
-      "device": "cuda"
-    }
-  },
-  "generation_kwargs": {
-    "standard": {
-      "guidance_scale": 3.5,
-      "num_inference_steps": 25
-    },
-    "hd": {
-      "guidance_scale": 5.5,
-      "num_inference_steps": 50
-    }
-  }
-}
+This server is designed to work out of the box with no extra configuration, but it's easy to tinker with.
 
-```
-
-The format is very flexible and many entries are not pre-defined but are used as is in API calls to `diffusers` python objects.
-
-### Enhancer JSON Configuration
-
-Enhancers are optional but provide additional functionality to improve image quality and creativity. They depend on an OpenAI chat API.
-
-A simple `enhancer.json` may look like this (examples tend to work well):
-```json
-{
-  "model": "gpt-3.5-turbo",
-  "messages": [
-    {"role": "system", "content": "You are an AI prompt enhancer for image generation AI models, when given a description of an image, be creative and enhance the description to make the image better. No yapping, just output the best enhanced image prompt you can."},
-    {"role": "user", "content": "photo of an ancient castle very atmospheric"},
-    {"role": "assistant", "content": "photo of an ancient castle with a majestic, eerie ambiance, highlighting its towering spires, weathered stone, and ivy-clad walls, set against a backdrop of a dense, mystic forest under a sky completely shrouded in clouds yet illuminated by a diffused, ethereal light, where the fusion of natural decay and enduring architectural grandeur evokes a sense of timeless mystery and the eternal battle between man and nature, casting a glow that reveals subtle details and textures, creating an impression of haunting beauty and solemn tranquility. "},
-    {"role": "user", "content": "photo of a futuristic cityscape with flying cars"},
-    {"role": "assistant", "content": " A bustling futuristic cityscape stretches out beneath a twilight sky, where sleek, metallic skyscrapers pierce through wisps of neon-lit clouds. Flying cars, adorned with glowing blue and purple lights, zip silently along invisible aerial highways, casting dynamic shadows on the glassy surfaces below. The city’s architecture blends organic curves with sharp geometric lines, creating a harmonious fusion of nature and technology. Holographic billboards flicker with vibrant advertisements, while pedestrians in streamlined attire stroll along elevated walkways lined with lush vertical gardens. The scene is bathed in a soft, ethereal glow from the setting sun, reflecting off the polished surfaces and adding a warm contrast to the cool artificial lights. The composition captures the city from a bird’s-eye view, showcasing its intricate layout and futuristic charm, evoking a sense of wonder and technological advancement."}
-  ],
-  "temperature": 1.0,
-  "max_tokens": 256
-}
-```
+There is a more detailed configuration guide in [CONFIG.md](CONFIG.md).
 
-There are 2 pre-configured rule based openai enhancers, `openai-enhancer.json` which contains some additional safety features, and `openai-enhancer-research.json` which is for scientists. They don't always work well, and the results are highly dependent on the quality of the chat model you use, so feel free to create your own and experiment.
-
-You may also find that FLUX.1 performs very well with no prompt enhancement at all.
-
-## Additional Pre-Configured models
+## Pre-Configured models
 
 > FP8 is the only available quantization so far, but more will come soon!
 
@@ -226,8 +154,14 @@ Low VRAM options (<4GB VRAM, ~32GB RAM, `+enable_sequential_cpu_offload`, float1
 - `dev-low`: `flux.1-dev-low.json` Dev FP16, at least a few minutes per image
 - `merged-low`: `sayakpaul-flux.1-merged-low.json` Dev+Schnell FP16 merged, 12 steps by default
 
+High VRAM options (80GB VRAM, float32 inference, really slow)
+
+- `dev-f32`: `flux.1-dev-fp32.json` Dev in full float32 glory, no loss.
+- `schnell-f32`: `flux.1-schnell-f32.json` Schnell in full float32 glory, no loss.
+
+There are more: check out the `config/lib` folder for further examples, including lora options.
 
-> Timings are measured at 1024x1024 on an Nvidia A100 and may vary wildly from your system.
+> Timings are casually measured at 1024x1024 with `standard` quality on an Nvidia A100 and may differ wildly on your system.
 
 > \*) The name of the generator file is used to determine whether a model is already loaded. If you edit a generator config in a way that requires reloading the model (such as changing `pipeline` or `options`), it won't be reloaded automatically. `config.json` and `generation_kwargs` are always re-read on each API call.
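
The footnote above describes two behaviours: a model is loaded once per generator file name, while `generation_kwargs` are re-read on every call. The Python sketch below illustrates that distinction and is not the server's actual code. `get_pipeline`, `generation_kwargs`, `read_json` and the `_loaded` cache are hypothetical names, and the rule for telling per-quality settings apart from a flat set is an assumption based on the two sample layouts in CONFIG.md.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable

# Hypothetical cache: generator file name -> loaded pipeline object.
_loaded: dict[str, Any] = {}


def read_json(path: Path) -> dict[str, Any]:
    return json.loads(path.read_text())


def get_pipeline(config_dir: Path, generator: str,
                 load: Callable[[dict[str, Any]], Any]) -> Any:
    """Load a pipeline once per generator file name, then reuse it."""
    if generator not in _loaded:
        _loaded[generator] = load(read_json(config_dir / generator))
    return _loaded[generator]


def generation_kwargs(config_dir: Path, generator: str, quality: str) -> dict[str, Any]:
    """Re-read the generator file on every call and pick the kwargs for `quality`."""
    kwargs = read_json(config_dir / generator).get("generation_kwargs", {})
    # Assumption: dict values mean the settings are grouped by quality name;
    # otherwise the same flat settings apply to every quality.
    if any(isinstance(value, dict) for value in kwargs.values()):
        return dict(kwargs.get(quality, {}))
    return dict(kwargs)


with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp)
    path = config_dir / "flux.1-dev.json"
    cfg = {
        "pipeline": {"torch_dtype": "bfloat16"},
        "generation_kwargs": {
            "standard": {"num_inference_steps": 25},
            "hd": {"num_inference_steps": 50},
        },
    }
    path.write_text(json.dumps(cfg))

    loads: list[str] = []

    def fake_load(config: dict[str, Any]) -> object:
        # Stand-in for the expensive model load.
        loads.append(config["pipeline"]["torch_dtype"])
        return object()

    first = get_pipeline(config_dir, path.name, fake_load)
    before = generation_kwargs(config_dir, path.name, "hd")

    # Edit both sections on disk while the "server" keeps running.
    cfg["pipeline"]["torch_dtype"] = "float16"
    cfg["generation_kwargs"]["hd"]["num_inference_steps"] = 30
    path.write_text(json.dumps(cfg))

    second = get_pipeline(config_dir, path.name, fake_load)
    after = generation_kwargs(config_dir, path.name, "hd")

print(loads)          # the pipeline was only loaded once
print(before, after)  # the generation kwargs picked up the edit
```

In this sketch the `torch_dtype` edit is ignored until the cache entry is dropped, which matches the footnote's warning that changes to `pipeline` or `options` need a reload.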