diff --git a/README.md b/README.md
index 57154cf..155312d 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,137 @@
-# Defect-Diffusion-Model
\ No newline at end of file

# IDDM: Industrial Defect Diffusion Model

[中文文档](README_zh.md)

### About the Model

The diffusion model used in this project is based on the classic DDPM introduced in the paper "[Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239)".

We named this project IDDM: Industrial Defect Diffusion Model. It aims to reproduce the model, write trainers and generators, and improve and optimize certain algorithms and network structures. This repository is **actively maintained**.

**Repository Structure**

```yaml
├── datasets
├── model
│   ├── ddpm.py
│   ├── modules.py
│   └── network.py
├── results
├── test
│   ├── noising_test
│   │   ├── landscape
│   │   │   └── noising_test.jpg
│   │   └── noise
│   │       └── noise.jpg
│   └── test_module.py
├── tools
│   ├── generate.py
│   └── train.py
├── utils
│   ├── initializer.py
│   └── utils.py
└── weight
```

### Next Steps

- [ ] 1. Implement cosine learning rate optimization.
- [ ] 2. Use a more advanced U-Net network model.
- [ ] 3. Generate larger-sized images.
- [ ] 4. Implement multi-GPU distributed training.
- [ ] 5. Enable fast deployment on cloud servers.

### Training

1. Take the `landscape` dataset as an example and place the dataset files in the `datasets` folder. The overall path of the dataset should be `/your/path/datasets/landscape`, and the image files should be located at `/your/path/datasets/landscape/*.jpg`.

2. Open the `train.py` file and locate the `--dataset_path` parameter. Modify the path in the parameter to the overall dataset path, for example `/your/path/datasets/landscape`.

3. Set the necessary parameters such as `--conditional`, `--run_name`, `--epochs`, `--batch_size`, `--image_size`, `--result_path`, etc. If no parameters are set, the default settings are used. There are two ways to set the parameters: directly modify the `parser` in the `if __name__ == "__main__":` section of the `train.py` file, or run the following commands in the terminal from the `/your/path/Defect-Diffusion-Model/tools` directory:
   **Conditional Training Command**

   ```bash
   python train.py --conditional True --run_name 'df' --epochs 300 --batch_size 16 --image_size 64 --num_classes 10
   ```
   **Unconditional Training Command**

   ```bash
   python train.py --conditional False --run_name 'df' --epochs 300 --batch_size 16 --image_size 64
   ```
4. Wait for the training to complete.
5. If the training is interrupted for any reason, you can resume it in the `train.py` file by setting `--resume` to `True`, specifying the epoch number at which the interruption occurred (`--start_epoch`), providing the folder name of the interrupted run (`--load_model_dir`, i.e. the `run_name`), and running the file again.
Alternatively, you can use the following command to resume training:
   **Conditional Resume Training Command**

   ```bash
   python train.py --resume True --start_epoch 10 --load_model_dir 'df' --conditional True --run_name 'df' --epochs 300 --batch_size 16 --image_size 64 --num_classes 10
   ```

**Parameter Explanation**

| Parameter Name | Conditional | Usage | Type | Description |
| ---------------------- | :---------: | ------------------------------- | :--: | ------------------------------------------------------------ |
| --conditional | | Enable conditional training | bool | Enable to modify custom configurations, such as the number of classes and the classifier-free guidance interpolation weight |
| --run_name | | File name | str | File name used to initialize the model and to save training information |
| --epochs | | Total number of epochs | int | Total number of training epochs |
| --batch_size | | Training batch size | int | Size of each training batch |
| --num_workers | | Number of loading processes | int | Number of subprocesses used for data loading. It consumes a large amount of CPU and memory but speeds up training |
| --image_size | | Input image size | int | Input image size; input and output sizes adapt to this value |
| --dataset_path | | Dataset path | str | Path to a conditional dataset (e.g. CIFAR-10) with each class in a separate folder, or to an unconditional dataset with all images in one folder (see the example layout below this table) |
| --fp16 | | Half-precision training | bool | Enable half-precision training. It effectively reduces GPU memory usage but may affect training accuracy and results |
| --distributed | | Distributed training | bool | TODO |
| --optim | | Optimizer | str | Optimizer selection. Currently supports Adam and AdamW |
| --lr | | Learning rate | int | Initial learning rate. Currently only a linear learning rate schedule is supported |
| --result_path | | Save path | str | Path to save the training results |
| --save_model_interval | | Save the model after each epoch | bool | Whether to save the model after each epoch, so that models can be selected based on the visualized samples |
| --start_model_interval | | Start epoch for saving models | int | Epoch at which model saving starts; this saves disk space. Defaults to -1 if not set; if set, models are saved from that epoch onward. Must be used together with --save_model_interval |
| --vis | | Visualize dataset information | bool | Enable visualization of dataset information, so that models can be selected based on the visualized samples |
| --resume | | Resume interrupted training | bool | Set to `True` to resume interrupted training. Note: if the interruption epoch falls outside the range set by --start_model_interval, resuming will not take effect. For example, if model saving starts at epoch 100 and the interruption happened at epoch 50, no epoch checkpoint exists to load. Because the `xxx_last.pt` file is saved every epoch, the last saved model is used to resume interrupted training |
| --start_epoch | | Epoch number of the interruption | int | Epoch number at which the training was interrupted |
| --load_model_dir | | Folder name of the loaded model | str | Folder name of the interrupted run from which the model is loaded |
| --num_classes | ✓ | Number of classes | int | Number of classes, used to distinguish between classes |
| --cfg_scale | ✓ | Classifier-free guidance weight | int | Classifier-free guidance interpolation weight, for better generation results |

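As a reference for `--dataset_path`, the two supported layouts look like this (the dataset and folder names below are only illustrative examples, not names required by the code):

```yaml
# Conditional dataset: the path points to the parent folder,
# with one sub-folder per class (folder names are examples).
/your/path/datasets/cifar10
├── class_0
│   ├── image_0.jpg
│   └── image_1.jpg
└── class_1
    ├── image_2.jpg
    └── image_3.jpg

# Unconditional dataset: the path points to a single folder
# that directly contains all images.
/your/path/datasets/landscape
├── image_0.jpg
└── image_1.jpg
```
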
### Generation

1. Open the `generate.py` file and locate the `--weight_path` parameter. Modify the path in the parameter to the path of your model weights, for example `/your/path/weight/model.pt`.

2. Set the necessary parameters such as `--conditional`, `--generate_name`, `--num_images`, `--num_classes`, `--class_name`, `--image_size`, `--result_path`, etc. If no parameters are set, the default settings are used. There are two ways to set the parameters: directly modify the `parser` in the `if __name__ == "__main__":` section of the `generate.py` file, or run the following commands in the terminal from the `/your/path/Defect-Diffusion-Model/tools` directory:

   **Conditional Generation Command**

   ```bash
   python generate.py --conditional True --generate_name 'df' --num_images 8 --num_classes 10 --class_name 0 --image_size 64 --weight_path '/your/path/weight/model.pt'
   ```

   **Unconditional Generation Command**

   ```bash
   python generate.py --conditional False --generate_name 'df' --num_images 8 --image_size 64 --weight_path '/your/path/weight/model.pt'
   ```

3. Wait for the generation process to complete. A batched example that loops over several classes is sketched right after this list.

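To cover every class of a conditional model in one go, the documented flags can simply be looped over from the shell. A minimal sketch, assuming a 10-class model and the illustrative weight path used above (the per-class `--generate_name` values are placeholders):

```bash
# Illustrative sketch: generate 8 images for each of the 10 classes.
# Flags are the ones documented above; paths and names are placeholders.
for class_id in $(seq 0 9); do
  python generate.py --conditional True --generate_name "df_class_${class_id}" \
    --num_images 8 --num_classes 10 --class_name "${class_id}" --image_size 64 \
    --weight_path '/your/path/weight/model.pt'
done
```
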
**Parameter Explanation**

| Parameter Name | Conditional | Usage | Type | Description |
| --------------- | :---------: | ------------------------------- | :--: | ------------------------------------------------------------ |
| --conditional | | Enable conditional generation | bool | If enabled, custom configurations can be modified, such as the class or the classifier-free guidance interpolation weight |
| --generate_name | | File name | str | File name used to initialize the model and to save the generated results |
| --image_size | | Input image size | int | Input image size; input and output sizes adapt to this value |
| --num_images | | Number of generated images | int | Number of images generated per run |
| --weight_path | | Path to model weights | str | Path to the model weights file, which must be loaded for generation |
| --result_path | | Save path | str | Path to save the generated images |
| --num_classes | ✓ | Number of classes | int | Number of classes, used to distinguish between classes |
| --class_name | ✓ | Class name | int | Index of the class to generate images for |
| --cfg_scale | ✓ | Classifier-free guidance weight | int | Classifier-free guidance interpolation weight, for better generation results |

### Deployment

To be continued.

### Acknowledgements

[@dome272](https://github.com/dome272/Diffusion-Models-pytorch)

[@OpenAI](https://github.com/openai/improved-diffusion)
\ No newline at end of file

diff --git a/README_zh.md b/README_zh.md
new file mode 100644
index 0000000..7c357cf
--- /dev/null
+++ b/README_zh.md
@@ -0,0 +1,149 @@

# IDDM：工业缺陷扩散模型

[English Document](README.md)

### 关于模型

该扩散模型为经典的DDPM，来源于论文《**[Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239)**》

我们将此项目命名为IDDM: Industrial Defect Diffusion Model，中文名为工业缺陷扩散模型。在此项目中进行模型复现、训练器和生成器编写、部分算法和网络结构的改进与优化，该仓库**持续维护**。

**本仓库整体结构**

```yaml
├── datasets
├── model
│   ├── ddpm.py
│   ├── modules.py
│   └── network.py
├── results
├── test
│   ├── noising_test
│   │   ├── landscape
│   │   │   └── noising_test.jpg
│   │   └── noise
│   │       └── noise.jpg
│   └── test_module.py
├── tools
│   ├── generate.py
│   └── train.py
├── utils
│   ├── initializer.py
│   └── utils.py
└── weight
```

### 接下来要做

- [ ] 1. 新增cosine学习率优化
- [ ] 2. 使用效果更优的U-Net网络模型
- [ ] 3. 更大尺寸的生成图像
- [ ] 4. 多卡分布式训练
- [ ] 5. 云服务器快速部署

### 训练

1. 以`landscape`数据集为例，将数据集文件放入`datasets`文件夹中，该数据集的总路径为`/your/path/datasets/landscape`，数据集图片路径为`/your/path/datasets/landscape/*.jpg`。

2. 打开`train.py`文件，找到`--dataset_path`参数，将参数中的路径修改为数据集的总路径，例如`/your/path/datasets/landscape`。

3. 设置必要参数，例如`--conditional`、`--run_name`、`--epochs`、`--batch_size`、`--image_size`、`--result_path`等参数，若不设置参数则使用默认设置。我们有两种参数设置方法：其一是直接对`train.py`文件`if __name__ == "__main__":`中的`parser`进行设置；其二是在控制台的`/your/path/Defect-Diffusion-Model/tools`路径下输入以下命令：
   **有条件训练命令**

   ```bash
   python train.py --conditional True --run_name 'df' --epochs 300 --batch_size 16 --image_size 64 --num_classes 10
   ```
   **无条件训练命令**

   ```bash
   python train.py --conditional False --run_name 'df' --epochs 300 --batch_size 16 --image_size 64
   ```
4. 等待训练完成即可。
5. 若因异常原因中断训练，我们可以在`train.py`文件中恢复训练：首先将`--resume`设置为`True`，其次设置异常中断的epoch编号（`--start_epoch`），再写入该次训练所在的文件夹名称（`--load_model_dir`，即`run_name`），最后运行文件即可。也可以使用如下命令进行恢复：
   **有条件恢复训练命令**

   ```bash
   python train.py --resume True --start_epoch 10 --load_model_dir 'df' --conditional True --run_name 'df' --epochs 300 --batch_size 16 --image_size 64 --num_classes 10
   ```
   **无条件恢复训练命令**

   ```bash
   python train.py --resume True --start_epoch 10 --load_model_dir 'df' --conditional False --run_name 'df' --epochs 300 --batch_size 16 --image_size 64
   ```

**参数讲解**

| **参数名称** | 条件参数 | 参数使用方法 | 参数类型 | 参数解释 |
| ---------------------- | :------: | -------------------------------- | :------: | ------------------------------------------------------------ |
| --conditional | | 开启条件训练 | bool | 若开启可修改自定义配置，例如修改类别、classifier-free guidance插值权重 |
| --run_name | | 文件名称 | str | 初始化模型的文件名称，用于设置保存信息 |
| --epochs | | 总迭代次数 | int | 训练总迭代次数 |
| --batch_size | | 训练批次 | int | 训练批次大小 |
| --num_workers | | 加载进程数量 | int | 用于数据加载的子进程数量，大量占用CPU和内存，但可以加快训练速度 |
| --image_size | | 输入图像大小 | int | 输入图像大小，自适应输入输出尺寸 |
| --dataset_path | | 数据集路径 | str | 有条件数据集，例如cifar10，每个类别一个文件夹，路径为主文件夹；无条件数据集，所有图放在一个文件夹，路径为图片文件夹 |
| --fp16 | | 半精度训练 | bool | 开启半精度训练，有效减少显存使用，但无法保证训练精度和训练结果 |
| --distributed | | 分布式训练 | bool | TODO |
| --optim | | 优化器 | str | 优化器选择，目前支持adam和adamw |
| --lr | | 学习率 | int | 初始化学习率，目前仅支持线性学习率 |
| --result_path | | 保存路径 | str | 保存路径 |
| --save_model_interval | | 是否每次训练储存 | bool | 是否每次训练储存模型，根据可视化生成样本信息筛选模型 |
| --start_model_interval | | 设置开始每次训练存储编号 | int | 设置开始每次训练存储的epoch编号，该设置可节约磁盘空间，若不设置默认-1，若设置则从该epoch开始保存每次训练的pt文件，需要与--save_model_interval同时开启 |
| --vis | | 可视化数据集信息 | bool | 打开可视化数据集信息，根据可视化生成样本信息筛选模型 |
| --resume | | 中断恢复训练 | bool | 恢复训练将设置为`True`。注意：设置异常中断的epoch编号若在--start_model_interval参数条件外，则不生效。例如开始保存模型时间为100，中断编号为50，由于我们没有保存模型，所以无法设置任意加载epoch点。每次训练我们都会保存xxx_last.pt文件，所以我们需要使用最后一次保存的模型进行中断恢复训练 |
| --start_epoch | | 中断迭代编号 | int | 设置异常中断的epoch编号 |
| --load_model_dir | | 加载模型所在文件夹 | str | 写入中断训练所加载模型的所在文件夹 |
| --num_classes | 是 | 类别个数 | int | 类别个数，用于区分类别 |
| --cfg_scale | 是 | classifier-free guidance插值权重 | int | classifier-free guidance插值权重，用于更好的生成模型效果 |

### 生成

1. 打开`generate.py`文件，找到`--weight_path`参数，将参数中的路径修改为模型权重路径，例如`/your/path/weight/model.pt`。

2. 设置必要参数，例如`--conditional`、`--generate_name`、`--num_images`、`--num_classes`、`--class_name`、`--image_size`、`--result_path`等参数，若不设置参数则使用默认设置。我们有两种参数设置方法：其一是直接对`generate.py`文件`if __name__ == "__main__":`中的`parser`进行设置；其二是在控制台的`/your/path/Defect-Diffusion-Model/tools`路径下输入以下命令：
   **有条件生成命令**

   ```bash
   python generate.py --conditional True --generate_name 'df' --num_images 8 --num_classes 10 --class_name 0 --image_size 64 --weight_path '/your/path/weight/model.pt'
   ```

   **无条件生成命令**

   ```bash
   python generate.py --conditional False --generate_name 'df' --num_images 8 --image_size 64 --weight_path '/your/path/weight/model.pt'
   ```

3. 等待生成完成即可。

**参数讲解**

| **参数名称** | 条件参数 | 参数使用方法 | 参数类型 | 参数解释 |
| --------------- | :------: | -------------------------------- | :------: | ------------------------------------------------------------ |
| --conditional | | 开启条件生成 | bool | 若开启可修改自定义配置，例如修改类别、classifier-free guidance插值权重 |
| --generate_name | | 文件名称 | str | 初始化模型的文件名称，用于设置保存信息 |
| --image_size | | 输入图像大小 | int | 输入图像大小，自适应输入输出尺寸 |
| --num_images | | 生成图片个数 | int | 单次生成图片个数 |
| --weight_path | | 权重路径 | str | 模型权重路径，网络生成需要加载文件 |
| --result_path | | 保存路径 | str | 保存路径 |
| --num_classes | 是 | 类别个数 | int | 类别个数，用于区分类别 |
| --class_name | 是 | 类别名称 | int | 类别序号，用于对指定类别生成 |
| --cfg_scale | 是 | classifier-free guidance插值权重 | int | classifier-free guidance插值权重，用于更好的生成模型效果 |

### 部署

未完待续

### 致谢

[@dome272](https://github.com/dome272/Diffusion-Models-pytorch)

[@OpenAI](https://github.com/openai/improved-diffusion)