
[Question] Distributed Training #1

Open
zaptrem opened this issue Nov 5, 2024 · 2 comments

Comments


zaptrem commented Nov 5, 2024

Does this support distributed training (e.g., DDP/FSDP)? Thanks for sharing!

ClashLuke (Owner) commented

Hey, thank you for your interest!
Everything should support DDP, but there is no autosharding of optimizer states or computation.

PaLM-SFAdamW might support FSDP, but SOAP won't distribute well.
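For anyone landing here, a minimal sketch of what "supports DDP" means in practice, assuming the usual torchrun launch; the optimizer class name `PaLMForeachSFAdamW` and the `heavyball` import below are assumptions, so check the repo for the exact names. The point is that DDP only needs a standard torch.optim-style optimizer: gradients are all-reduced during backward, and the optimizer then runs the same update on every rank.

```python
# Minimal DDP sketch (launch with `torchrun --nproc_per_node=N train.py`).
# The optimizer class name below (PaLMForeachSFAdamW) is assumed; any
# torch.optim-style optimizer from this repo should slot in the same way.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

import heavyball  # assumed package/import name


def main():
    dist.init_process_group("nccl")
    device = torch.device("cuda", dist.get_rank() % torch.cuda.device_count())

    model = torch.nn.Linear(512, 512).to(device)
    ddp_model = DDP(model, device_ids=[device.index])

    # DDP all-reduces gradients during backward(); the optimizer then applies
    # the same update on every rank, so its state is fully replicated
    # (no sharding of optimizer states, as noted above).
    opt = heavyball.PaLMForeachSFAdamW(ddp_model.parameters(), lr=1e-3)  # assumed class name

    x = torch.randn(8, 512, device=device)
    loss = ddp_model(x).square().mean()
    loss.backward()
    opt.step()
    opt.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Because the optimizer state is replicated, per-GPU memory is the same as single-GPU training; sharding that state would require FSDP or ZeRO-style wrappers, which is where the caveats above about PaLM-SFAdamW and SOAP apply.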


LLouice commented Nov 9, 2024

Hi @ClashLuke, could you please add a section to the README covering DDP/DeepSpeed/FSDP compatibility? That would greatly help users apply these techniques in their own projects!
