Question about training and (possibly) finetuning the model #6

Open
costrice opened this issue Dec 18, 2024 · 5 comments

Comments

@costrice

Hi, thanks for your impressive work on image generation!

I anticipate using it as the generative backbone for my future work, so I am curious how many GPU resources are needed to train the model from scratch. More importantly, is it possible to fine-tune the model using fewer GPU resources, like we commonly do with diffusion-based models such as SD? Could you please provide some information? Thanks very much!

@JeyesHan
Collaborator

JeyesHan commented Dec 18, 2024

Training from scratch is very expensive, especially at 1024x1024 resolution. We highly recommend you fine-tune Infinity. In our full-parameter fine-tuning test with 4 GPUs, one iteration takes around 6 s and 50 GB of VRAM per GPU, with a global batch size of 16 at 1024x1024 resolution. You can estimate the GPU resources for your fine-tuning task from these numbers.
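
For a quick back-of-envelope estimate from those numbers, here is a minimal sketch (plain Python; `dataset_size` and `num_epochs` are hypothetical placeholders, not values from the setup above):

```python
# Estimate fine-tuning cost from the measured numbers above:
# 4 GPUs, ~6 s/iteration, ~50 GB VRAM per GPU, global batch size 16 at 1024x1024.

sec_per_iter = 6          # measured: seconds per iteration (4 GPUs)
global_batch_size = 16    # measured: global batch size
vram_per_gpu_gb = 50      # measured: peak VRAM per GPU

dataset_size = 100_000    # placeholder: number of fine-tuning images
num_epochs = 3            # placeholder: passes over the dataset

iterations = dataset_size * num_epochs // global_batch_size
wall_clock_hours = iterations * sec_per_iter / 3600

print(f"~{iterations} iterations, about {wall_clock_hours:.1f} hours on 4 GPUs")
print(f"each GPU needs ~{vram_per_gpu_gb} GB of VRAM")
```

With these placeholder values, that works out to roughly 18,750 iterations, or about 31 hours on 4 GPUs.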

@wxxhaoshuai

Hi, I would like to know how many computing resources are required to train the 125M model from scratch, and how many are required to fine-tune it?

@JeyesHan
Collaborator

JeyesHan commented Dec 26, 2024

@wxxhaoshuai
Apart from model size, the computing resources also depend on the dataset size and the target resolution. I think 16 GPUs (A100 or H100) are enough to train the 125M model from scratch at 256x256 resolution.
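
As a rough sanity check on why a 125M model fits comfortably, a sketch using textbook per-parameter memory costs for mixed-precision Adam training (these are generic assumptions, not numbers from the Infinity codebase):

```python
# Per-parameter memory for mixed-precision training with Adam:
#   2 B fp16 weights + 2 B fp16 gradients
#   4 B fp32 master weights + 8 B fp32 Adam moments (m, v)
params = 125e6                      # 125M-parameter model
bytes_per_param = 2 + 2 + 4 + 8     # 16 bytes per parameter
model_state_gb = params * bytes_per_param / 1024**3
print(f"weights + grads + optimizer state: ~{model_state_gb:.1f} GB")  # ~1.9 GB
```

So the model and optimizer state are tiny; at 256x256 the activation memory and batch size dominate, and 16 A100/H100 GPUs likely buy training throughput rather than capacity.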

@wxxhaoshuai

Will you release the smaller checkpoints, such as 125M or 1B?

@JeyesHan
Collaborator

JeyesHan commented Dec 26, 2024

@wxxhaoshuai These small models were trained on a small subset of the whole dataset and are used to demonstrate the scaling capability of Infinity. They were not fully trained with abundant data, resolutions, and iterations. Therefore, we have no plan to release the smaller models for Infinity 😭. We plan to release Infinity-20B.
