phi-1_5 Training memory usage question #658
-
Just wondering if others are seeing the same memory usage when using the phi finetuning example yml? I would usually expect roughly 4-5x the number of parameters for fine-tuning, but I am hitting memory issues on A4000s even when I try to use …
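For context, here is the back-of-envelope math behind that expectation: a minimal sketch that assumes phi-1.5's published 1.3B parameter count and reads "4-5x the number of parameters" as bytes per parameter. The multipliers are the rule of thumb from the question, not profiled measurements.

```python
# Rough VRAM estimate for the "4-5x parameters" rule of thumb.
# Assumes phi-1.5 at ~1.3B parameters; the bytes-per-parameter
# multipliers come from the question above, not from profiling.

PHI_1_5_PARAMS = 1.3e9  # phi-1.5 parameter count

def estimate_vram_gb(params: float, bytes_per_param: float) -> float:
    """VRAM estimate in GB for a given bytes-per-parameter budget."""
    return params * bytes_per_param / 1e9

for bpp in (4, 5):
    print(f"{bpp} bytes/param -> ~{estimate_vram_gb(PHI_1_5_PARAMS, bpp):.1f} GB")

# ~5.2-6.5 GB expected, well under an A4000's 16 GB of VRAM,
# which is why hitting OOMs here is surprising.
```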
-
I've been looking into memory usage when finetuning Phi-1.5 with QLoRA. Memory usage scales super-linearly with sequence length and batch size. I can finetune it with a sequence length of 1500 and a micro-batch size of 1 and hit around 12.6 GB of VRAM usage on an RTX 4090; a batch size of 2 uses 22.6 GB.
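Using just those two measurements, you can split the total into a fixed cost (weights, CUDA context) plus a per-sample activation cost at sequence length 1500. A minimal sketch; the linear-in-batch-size model is an assumption for illustration, not a profiler result:

```python
# Fit total_vram ~= fixed + batch_size * per_sample from the two
# measurements reported above at sequence length 1500.
# The linear-in-batch-size model is an illustrative assumption.

b1, gb1 = 1, 12.6  # micro-batch size 1 -> 12.6 GB
b2, gb2 = 2, 22.6  # micro-batch size 2 -> 22.6 GB

per_sample = (gb2 - gb1) / (b2 - b1)  # ~10.0 GB of activations per sample
fixed = gb1 - b1 * per_sample         # ~2.6 GB for weights + overhead

print(f"fixed ~{fixed:.1f} GB, per-sample ~{per_sample:.1f} GB at seq_len 1500")
```

Under this model a 24 GB RTX 4090 tops out around batch size 2 at this sequence length, so shrinking the per-sample term (shorter sequences, gradient checkpointing) is what buys headroom.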