[POC] SAFETY CHECK DRAFT DO NOT MERGE #74
Closed
Here's a quick draft demonstrating that we can integrate a safety check following the guidelines at https://discuss.huggingface.co/t/how-to-enable-safety-checker-in-stable-diffusion-2-1-pipeline/31286. While the impact on performance is minimal, I believe it's prudent not to make this check mandatory across our entire pipeline. To implement this feature, we would need to submit a pull request to the diffusers repository to allow enabling or disabling safety checks during inference, if feasible.
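For reference, here is a minimal sketch of how the safety checker can be attached to the pipeline, along the lines of the forum thread linked above. The model IDs and the `stable-diffusion-2-1` checkpoint are illustrative assumptions and may not match what this draft actually uses:

```python
# Minimal sketch, following the forum thread linked above. The model IDs
# below are illustrative assumptions, not necessarily what this draft uses.
import torch
from diffusers import StableDiffusionPipeline
from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
from transformers import CLIPFeatureExtractor

device = "cuda" if torch.cuda.is_available() else "cpu"

# SD 2.x checkpoints ship without a safety checker, so load one explicitly.
safety_checker = StableDiffusionSafetyChecker.from_pretrained(
    "CompVis/stable-diffusion-safety-checker"
)
feature_extractor = CLIPFeatureExtractor.from_pretrained(
    "openai/clip-vit-base-patch32"
)

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    safety_checker=safety_checker,
    feature_extractor=feature_extractor,
).to(device)

result = pipe("a photo of an astronaut riding a horse")
# One boolean per generated image; flagged images come back blacked out.
print(result.nsfw_content_detected)

# Until per-call toggling is supported upstream, one common workaround is to
# detach the checker (the weights stay loaded, so VRAM usage is unchanged):
pipe.safety_checker = None
```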
Benchmark
| Metric | Without safety load | With safety load | With safety load and check |
| --- | --- | --- | --- |
| pipeline load time | 1.839 s | 2.408 s | 2.313 s |
| pipeline load max GPU memory allocated | 4.833 GiB | 5.959 GiB | 5.965 GiB |
| pipeline load max GPU memory reserved | 4.900 GiB | 6.043 GiB | 6.027 GiB |
| avg inference time | 0.805 s | 0.854 s | 0.845 s |
| avg inference time per output | 0.268 s | 0.285 s | 0.282 s |
| avg inference max GPU memory allocated | 7.660 GiB | 8.787 GiB | 8.792 GiB |
| avg inference max GPU memory reserved | 9.498 GiB | 10.643 GiB | 10.645 GiB |
Conclusion
Adding the safety check increases required VRAM by roughly 1.1 GiB and pipeline load time by about 0.5 s, but has little effect on inference time (about 0.04 s per call, roughly 5%).