Some tensors share memory #9

martinjingyu · 2024-11-29T21:20:36Z

I ran into the same problem as #6, and I fixed it!

martinjingyu · 2024-11-29T21:25:04Z

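Open the file "curiosity_redteam/custom_trlx/trlx/trainer/accelerate_base_trainer.py".
Find the method save(self, directory: Optional[str] = None, **kwargs) and copy the lm_head weights before the call to self.accelerator.save_state, so that the method looks like this: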

def save(self, directory: Optional[str] = None, **kwargs):
    """Creates a checkpoint of the optimizer, scheduler and model"""
    dst_dir = directory or self.config.train.checkpoint_dir

    # Added: manually copy weights to detach shared memory.
    # lm_head.weight is tied to transformer.wte.weight, which triggers the
    # "Some tensors share memory" error on save. This attribute path is
    # specific to the model used here; adjust it for other architectures.
    with torch.no_grad():
        base = self.model.base_model.base_model.model
        base.lm_head.weight = torch.nn.Parameter(base.transformer.wte.weight.clone())
    self.accelerator.save_state(dst_dir, **kwargs)

    if self.config.model.peft_config is not None and self.accelerator.is_main_process:
        # Remove "pytorch_model.bin" because it contains more than necessary,
        # let save_pretrained recreate it with just the value heads.
        model_file = os.path.join(dst_dir, "pytorch_model.bin")
        if os.path.exists(model_file):
            os.remove(model_file)
        self.accelerator.unwrap_model(self.model).save_pretrained(dst_dir)
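
For anyone who wants to see what the fix does in isolation, here is a minimal sketch. TinyTiedLM, shares_storage and untie_lm_head are hypothetical names made up for this illustration (they are not part of trlx); the toy model only mimics the tied wte / lm_head pair. It checks that the two weights share memory before the copy and no longer share it afterwards, while keeping the same values.

```python
import torch
from torch import nn


class TinyTiedLM(nn.Module):
    """Hypothetical stand-in for a model whose lm_head is tied to its embedding."""

    def __init__(self, vocab_size: int = 10, hidden: int = 4):
        super().__init__()
        self.wte = nn.Embedding(vocab_size, hidden)
        self.lm_head = nn.Linear(hidden, vocab_size, bias=False)
        # Tie the output projection to the input embedding (same Parameter object).
        self.lm_head.weight = self.wte.weight


def shares_storage(a: torch.Tensor, b: torch.Tensor) -> bool:
    """Return True if both tensors start at the same memory address."""
    return a.data_ptr() == b.data_ptr()


def untie_lm_head(model: TinyTiedLM) -> None:
    """Give lm_head its own copy of the embedding weights (same idea as the fix above)."""
    with torch.no_grad():
        model.lm_head.weight = nn.Parameter(model.wte.weight.clone())


model = TinyTiedLM()
tied_before = shares_storage(model.lm_head.weight, model.wte.weight)
untie_lm_head(model)
tied_after = shares_storage(model.lm_head.weight, model.wte.weight)
same_values = torch.equal(model.lm_head.weight, model.wte.weight)
print(tied_before, tied_after, same_values)
```

One thing to keep in mind: after the copy, lm_head.weight is a new, independent parameter, so an optimizer that was created earlier still only holds the original embedding weight.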

martinjingyu mentioned this issue Nov 29, 2024

Same question with Issue #6 #8

Open
