
add a new image_size parameter in train_dalle and generate #310

Draft
wants to merge 1 commit into base: main

Conversation

@rom1504 (Contributor) commented on Jun 16, 2021

VAE models can be used with patches of any size.
For example, a model trained on a 16x16 grid of patches can still be used on a 32x32 grid, which increases the sequence length from 256 to 1024 in DALL-E.
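A minimal sketch (not this repo's exact code) of why that works: a fully convolutional VAE encoder downsamples by a fixed factor, so the token grid, and hence the DALL-E sequence length, scales with the input resolution. The `num_layers` default and the sizes below are assumptions for illustration.

```python
def image_seq_len(image_size: int, num_layers: int = 3) -> int:
    """Number of VAE tokens for a square image of side image_size.

    A conv VAE with num_layers stride-2 stages downsamples by
    2 ** num_layers, so the patch grid is (image_size // 2 ** num_layers)
    per side, independent of the resolution the VAE was trained on.
    """
    fmap_size = image_size // (2 ** num_layers)
    return fmap_size ** 2

assert image_seq_len(128) == 256    # 16x16 patch grid
assert image_seq_len(256) == 1024   # 32x32 patch grid, same VAE weights
```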

This is still a draft: I think we should store the image_size in the DALLE model rather than in the VAE. The DALLE model needs a fixed sequence length (and therefore a fixed resolution), but the VAE works at any power-of-two resolution. I'll try to make this change later.

This works as-is if anyone wants to experiment with it a bit.

@rom1504 (Contributor, Author) commented on Jun 20, 2021

I'm thinking it should actually be possible to also make the patch_size configurable (as done in https://github.com/lucidrains/x-transformers/blob/a11b178573d2941c98a2c6d5b3a15fd9c97d4884/x_transformers/x_transformers.py#L644 )
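A hedged sketch of what a configurable patch_size could look like, in the spirit of the ViTransformerWrapper linked above; the class and names here are illustrative, not the actual x-transformers or DALLE-pytorch API.

```python
import torch
from torch import nn

class PatchEmbed(nn.Module):
    """ViT-style patch embedding with a configurable patch_size."""

    def __init__(self, image_size: int, patch_size: int, dim: int, channels: int = 3):
        super().__init__()
        assert image_size % patch_size == 0, 'image_size must be divisible by patch_size'
        self.num_patches = (image_size // patch_size) ** 2
        # a single conv with kernel = stride = patch_size turns each patch into one token
        self.proj = nn.Conv2d(channels, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        x = self.proj(img)                    # (b, dim, h / p, w / p)
        return x.flatten(2).transpose(1, 2)   # (b, num_patches, dim)

# halving patch_size would quadruple the sequence length for the same image_size
tokens = PatchEmbed(image_size=256, patch_size=16, dim=512)(torch.randn(1, 3, 256, 256))
assert tokens.shape == (1, 256, 512)
```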
