From fa02e774b0ddfd056202e47d17853071ec891c91 Mon Sep 17 00:00:00 2001
From: Jaret Burkett
Date: Sat, 10 Aug 2024 15:08:05 -0600
Subject: [PATCH] Added info about dataset

---
 README.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/README.md b/README.md
index 259d47f4..86115c01 100644
--- a/README.md
+++ b/README.md
@@ -79,6 +79,19 @@ Coming very soon. Getting base out then will have a notebook that makes all that
 
 ---
 
+## Dataset Preparation
+
+Datasets generally need to be a folder containing images and associated text files. Currently, the only supported
+formats are jpg, jpeg, and png. Webp currently has issues. The text files should be named the same as the images
+but with a `.txt` extension. For example, `image2.jpg` and `image2.txt`. The text file should contain only the caption.
+You can add the word `[trigger]` in the caption file and if you have `trigger_word` in your config, it will be automatically
+replaced.
+
+Images are never upscaled but they are downscaled and placed in buckets for batching. **You do not need to crop/resize your images**.
+The loader will automatically resize them and can handle varying aspect ratios.
+
+---
+
 ## EVERYTHING BELOW THIS LINE IS OUTDATED
 
 It may still work like that, but I have not tested it in a while.
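
The folder layout described in the Dataset Preparation section can be sanity-checked before training. The following is a minimal sketch, not the toolkit's own loader: the helper names `find_missing_captions` and `apply_trigger` are hypothetical, and the substitution shown only assumes that `[trigger]` is replaced by the configured `trigger_word` value.

```python
from pathlib import Path

# Extensions the README lists as supported (webp is reported to have issues).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_missing_captions(dataset_dir):
    """Return sorted names of images that have no same-named .txt caption file."""
    missing = []
    for path in sorted(Path(dataset_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
            # image2.jpg is expected to sit next to image2.txt
            if not path.with_suffix(".txt").is_file():
                missing.append(path.name)
    return missing


def apply_trigger(caption, trigger_word):
    """Illustrative stand-in for the [trigger] substitution (assumed behavior)."""
    return caption.replace("[trigger]", trigger_word)
```

Calling `find_missing_captions` on a dataset folder returns the image files that still need a caption; an empty list means every supported image has a matching `.txt` file.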