Trained for 2 epochs on ~570k "illustrations" using a couple CogView ideas #287
afiaka87
started this conversation in
Show and tell
Replies: 1 comment
Thanks for providing this info.
Perhaps the most important finding here: all of this was done in less than a single day on my RTX 2070 with a mere 8 GiB of VRAM, thanks almost entirely to DeepSpeed's gradient accumulation and automatic mixed precision.
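For readers unfamiliar with these two DeepSpeed features, here is a minimal sketch of how they are typically expressed in a DeepSpeed config. The micro-batch size and accumulation steps are hypothetical values, not the ones used in this run, and the post does not say whether the `amp` or `fp16` section was enabled, so `amp` is an assumption here.

```python
import json


def make_deepspeed_config(micro_batch_size, grad_accum_steps, world_size=1):
    """Build a DeepSpeed-style config dict (all values are illustrative)."""
    return {
        # DeepSpeed expects train_batch_size to equal
        # micro batch per GPU * accumulation steps * number of GPUs.
        "train_batch_size": micro_batch_size * grad_accum_steps * world_size,
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        # Assumption: Apex automatic mixed precision. The "fp16" section is
        # the alternative; the two should not be enabled together.
        "amp": {"enabled": True, "opt_level": "O1"},
    }


# Hypothetical numbers: 4 samples fit in VRAM at once, accumulated over 8 steps.
config = make_deepspeed_config(micro_batch_size=4, grad_accum_steps=8)
print(json.dumps(config, indent=2))
```

With these example numbers, the optimizer sees an effective batch of 32 samples while only 4 are held in memory at a time, which is what makes a small card workable.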
This is largely a reposting of an issue I made: #266 (comment)
I've done a run using `--loss_img_weight 1` and setting the presently hidden `stable` parameter to `True` in the DALLE initialization.
Here is a W&B report. I'm not tracking text and image loss separately, although the average loss seems to converge much more quickly; I assume that has something to do with the weighting exploring a "different loss curve". Happy to be corrected.
https://wandb.ai/dalle-pytorch-replicate/illustrations_imagenetvqgan/reports/Snapshot-Jun-6-2021-12-43pm--Vmlldzo3NTYxMjE?accessToken=hhov3b0wsf56tts63wx4qijkl4pnpiogizoh6a32bdctvngy5rvwtygjqpfyl1uj
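One possible reason the averaged loss looks different under `--loss_img_weight 1` is that the weight changes how the two terms are mixed, so curves logged with different weights are not directly comparable. Below is a minimal sketch, assuming the combined loss is a weighted average of the text and image cross-entropies and that the default weight is 7; both are assumptions about DALLE-pytorch at the time, and the loss values are made up for illustration.

```python
def combined_loss(loss_text, loss_img, loss_img_weight):
    """Weighted average of text and image losses (assumed formula)."""
    return (loss_text + loss_img_weight * loss_img) / (loss_img_weight + 1)


# Hypothetical per-modality losses at the same training step.
loss_text, loss_img = 2.0, 6.0

weighted_default = combined_loss(loss_text, loss_img, loss_img_weight=7)
weighted_equal = combined_loss(loss_text, loss_img, loss_img_weight=1)

print(weighted_default)  # 5.5, dominated by the image term
print(weighted_equal)    # 4.0, plain mean of the two terms
```

Under these assumptions the same underlying losses report a lower average with a weight of 1, so logging the text and image losses separately would be the cleanest way to compare runs.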
@lucidrains @robvanvolt @rom1504 @gabriel_syme @janEbert @mehdidc
Here is the byte pair encoding I used. Vocab size of 8192 covering 99.999% of all unique characters in about 6 million captions from conceptual captions. Perhaps overkill for these illustrations, which have a more limited vocabulary. Created with youtokentome.
https://www.dropbox.com/s/ay01p8zegfwof8t/variety.bpe
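A tokenizer with these settings could be trained roughly as follows. This is a sketch rather than the exact command used: `captions.txt` is a hypothetical file with one caption per line, and the `coverage` value is simply the 99.999% figure above expressed as a fraction.

```python
def bpe_training_kwargs(captions_path, model_path):
    """Collect the settings described above for youtokentome's BPE trainer."""
    return {
        "data": captions_path,   # plain-text file, one caption per line
        "model": model_path,     # where the trained BPE model is written
        "vocab_size": 8192,
        "coverage": 0.99999,     # fraction of characters covered by the vocab
    }


def train_bpe(captions_path, model_path):
    # Imported lazily so the sketch can be read and run without the package.
    import youtokentome as yttm  # pip install youtokentome

    return yttm.BPE.train(**bpe_training_kwargs(captions_path, model_path))


kwargs = bpe_training_kwargs("captions.txt", "variety.bpe")
print(kwargs)
```

Calling `train_bpe("captions.txt", "variety.bpe")` would then write the model file, provided youtokentome is installed and the caption file exists.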
Here is a checkpoint from the most recent iteration (still training). Decided to name the checkpoint "royalty free" as the dataset largely consists of 570,000 royalty free illustrations from the conceptual captions dataset.
Inspired by OpenAI:
Inspired by CogView
https://www.dropbox.com/s/drpkcmr6b3zbftm/royalty_free.pt
If you'd like to generate from this checkpoint and have access to an Nvidia GPU with cuda support:
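A minimal sketch of what such an invocation might look like, assuming the two files above were downloaded as `royalty_free.pt` and `variety.bpe` into a DALLE-pytorch checkout. The flag names (`--dalle_path`, `--bpe_path`, `--taming`, `--text`) are my understanding of that repository's `generate.py` and should be checked against `python generate.py --help`; the prompt is a made-up example, and `--taming` assumes the checkpoint was trained with the ImageNet VQGAN.

```python
import shlex


def build_generate_command(prompt,
                           dalle_path="royalty_free.pt",
                           bpe_path="variety.bpe"):
    """Assemble an argument list for DALLE-pytorch's generate.py (assumed flags)."""
    return [
        "python", "generate.py",
        "--dalle_path", dalle_path,  # checkpoint linked above
        "--bpe_path", bpe_path,      # youtokentome BPE model linked above
        "--taming",                  # assumption: VQGAN-based checkpoint
        "--text", prompt,
    ]


# Hypothetical prompt, chosen only for illustration.
command = build_generate_command("an illustration of a lighthouse at sunset")
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(command, check=True)` from the repository root.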