
How to train two models at the same time? #6028

Closed
1 task done
wangqiang9 opened this issue Aug 23, 2024 · 4 comments
Labels
question Further information is requested

Comments

@wangqiang9

Is there an existing issue for this bug?

  • I have searched the existing issues

🐛 Describe the bug

The official documentation gives an example of training a single model:

colossalai.launch(...)
plugin = GeminiPlugin(...)
booster = Booster(precision='fp16', plugin=plugin)

model = GPT2()
optimizer = HybridAdam(model.parameters())
dataloader = plugin.prepare_dataloader(train_dataset, batch_size=8)
lr_scheduler = LinearWarmupScheduler()
criterion = GPTLMLoss()

model, optimizer, criterion, dataloader, lr_scheduler = booster.boost(model, optimizer, criterion, dataloader, lr_scheduler)

for epoch in range(max_epochs):
    for input_ids, attention_mask in dataloader:
        outputs = model(input_ids.cuda(), attention_mask.cuda())
        loss = criterion(outputs.logits, input_ids)
        booster.backward(loss, optimizer)
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()

If my training involves two models, model1 and model2, and both need to be trained, how should I use the booster?

Environment

No response

@wangqiang9 wangqiang9 added the bug Something isn't working label Aug 23, 2024

@wangqiang9
Author

wangqiang9 commented Aug 23, 2024

May I ask whether it can be written like this:

model1, _, _, _, _ = booster.boost(model=model1)
model2, optimizer, criterion, dataloader, lr_scheduler = booster.boost(model2, optimizer, criterion, dataloader, lr_scheduler)


@Edenzzzz Edenzzzz added question Further information is requested and removed bug Something isn't working labels Aug 23, 2024
@Edenzzzz
Contributor

Edenzzzz commented Aug 23, 2024

Looks like the right way if you intend to freeze the first model. Is this throwing any error?
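For reference, the calling pattern under discussion can be sketched without ColossalAI installed. `StubBooster` below is a hypothetical stand-in for `colossalai.booster.Booster` that only mimics the 5-tuple return shape of `boost()` (the real Booster wraps each object); the point is the pattern itself: one `boost` call per model, with a separate optimizer for each model that should actually be trained.

```python
# Hypothetical stand-in for colossalai.booster.Booster: it only mimics the
# 5-tuple return shape of boost(); the real Booster wraps each object it
# receives (model, optimizer, etc.) for the configured plugin.
class StubBooster:
    def boost(self, model, optimizer=None, criterion=None,
              dataloader=None, lr_scheduler=None):
        return model, optimizer, criterion, dataloader, lr_scheduler


booster = StubBooster()

# Stand-ins for two nn.Modules and their HybridAdam optimizers.
model1, model2 = {"name": "model1"}, {"name": "model2"}
optimizer1, optimizer2 = {"params": "model1"}, {"params": "model2"}

# To train BOTH models, give each one its own optimizer and its own boost
# call. Boosting a model without an optimizer (as in the snippet above)
# means nothing updates its parameters, i.e. it stays frozen.
model1, optimizer1, _, _, _ = booster.boost(model1, optimizer1)
model2, optimizer2, criterion, dataloader, lr_scheduler = booster.boost(
    model2, optimizer2, "criterion", "dataloader", "scheduler"
)

# In the training loop, each boosted optimizer is then stepped separately,
# e.g. booster.backward(loss1, optimizer1); optimizer1.step(); and likewise
# with optimizer2 for model2's loss.
```

The key point, matching the reply above: a model you want trained must be boosted together with its own optimizer, while boosting a model alone leaves it frozen.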
