
Unable to reproduce the results with the provided model weights and code on LEVIR dataset. #5

Open
Ujjwal238 opened this issue Dec 26, 2024 · 5 comments

Comments

@Ujjwal238

To reproduce the error

Steps to reproduce the behavior:

  1. Added the provided weights to checkpoints/elgcnet_levir/ as best_checkpoint.pt
  2. Ran eval_cd.py

Error received

  File "/content/elgcnet/eval_cd.py", line 55, in main
    model.eval_models(checkpoint_name=args.checkpoint_name)
  File "/content/elgcnet/models/evaluator.py", line 178, in eval_models
    self._load_checkpoint(checkpoint_name)
  File "/content/elgcnet/models/evaluator.py", line 66, in _load_checkpoint
    checkpoint = torch.load(os.path.join(self.checkpoint_dir, checkpoint_name), map_location='cpu')
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x27'
@techmn
Owner

techmn commented Dec 26, 2024

  1. Make sure that you are using the correct file path.
  2. It's common practice to inspect the state dict and its keys before loading the model, to avoid errors caused by key mismatches.

Go to evaluator.py and look at the _load_checkpoint function. Update the function to load the model weights correctly:

def _load_checkpoint(self, checkpoint_name='best_ckpt.pt'):
    if os.path.exists(os.path.join(self.checkpoint_dir, checkpoint_name)):
        self.logger.write('loading last checkpoint...\n')
        # load the entire checkpoint
        checkpoint = torch.load(os.path.join(self.checkpoint_dir, checkpoint_name), map_location='cpu')

        # here the checkpoint is the state dict itself, not a wrapper dict
        if isinstance(self.net_G, torch.nn.DataParallel):
            msg = self.net_G.module.load_state_dict(checkpoint)  # ['model_G_state_dict']
        else:
            msg = self.net_G.load_state_dict(checkpoint)  # ['model_G_state_dict']
        print(msg)

        self.net_G.to(self.device)
        self.logger.write('\n')
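The key-inspection step suggested above can be sketched as follows. This is a minimal, self-contained illustration (the tiny Linear model and the /tmp path are placeholders, not part of the repo), showing how to print a checkpoint's top-level keys before calling load_state_dict:

```python
import torch

# Minimal sketch: save a tiny state dict, then inspect its keys before
# loading it into a model -- the same check recommended for
# best_checkpoint.pt. The model and path here are illustrative only.
net = torch.nn.Linear(4, 2)
torch.save(net.state_dict(), '/tmp/demo_ckpt.pt')

checkpoint = torch.load('/tmp/demo_ckpt.pt', map_location='cpu')

# A checkpoint may be the state dict itself, or a wrapper dict such as
# {'model_G_state_dict': ...}; printing the keys reveals which case applies.
print(sorted(checkpoint.keys()))  # ['bias', 'weight']

# If the keys match the model's own state dict, loading succeeds.
msg = net.load_state_dict(checkpoint)
print(msg)  # <All keys matched successfully>
```

If the printed keys are wrapped (e.g. under 'model_G_state_dict'), index into the dict first; if they are the parameter names themselves, pass the checkpoint directly, as in the snippet above.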

Hope it works!

@Ujjwal238
Author

I tried making the suggested changes but still received the same error as before. Please look into it.

@techmn
Owner

techmn commented Dec 27, 2024

There might be a problem with your downloaded files or the libraries you are using.
I am not getting any error. Here is the log:

checkpoint_name: elgcnet_levir_ckpt.pt checkpoint_dir: ./elgcnet_levir
<All keys matched successfully>

Begin evaluation...
Is_training: False. [1,2048],  running_mf1: 0.96269
Is_training: False. [101,2048],  running_mf1: 0.50000
Is_training: False. [201,2048],  running_mf1: 0.97121
Is_training: False. [301,2048],  running_mf1: 0.99011
Is_training: False. [401,2048],  running_mf1: 0.50000
Is_training: False. [501,2048],  running_mf1: 0.50000
Is_training: False. [601,2048],  running_mf1: 0.50000
Is_training: False. [701,2048],  running_mf1: 0.88616
Is_training: False. [801,2048],  running_mf1: 0.92438
Is_training: False. [901,2048],  running_mf1: 0.89096
Is_training: False. [1001,2048],  running_mf1: 0.97063
Is_training: False. [1101,2048],  running_mf1: 0.92529
Is_training: False. [1201,2048],  running_mf1: 0.49982
Is_training: False. [1301,2048],  running_mf1: 0.88886
Is_training: False. [1401,2048],  running_mf1: 0.50000
Is_training: False. [1501,2048],  running_mf1: 0.97076
Is_training: False. [1601,2048],  running_mf1: 0.97004
Is_training: False. [1701,2048],  running_mf1: 0.49924
Is_training: False. [1801,2048],  running_mf1: 0.50000
Is_training: False. [1901,2048],  running_mf1: 0.49871
Is_training: False. [2001,2048],  running_mf1: 0.98605
acc: 0.99118 miou: 0.91450 mf1: 0.95368 iou_0: 0.99076 iou_1: 0.83825 F1_0: 0.99536 F1_1: 0.91201

You can get help from here.

@Ujjwal238
Author

I went through the link you provided. The .pt file I obtained after training is 127 MB, while the uploaded .pt file is only 44 MB, so I believe there's a mismatch somewhere.
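One hedged diagnostic that may help here (an assumption about the cause, not a confirmed fix): the error "invalid load key, '\x27'" means the first byte of the file is an apostrophe, i.e. the file begins with plain text rather than a pickle or zip header, which often indicates an incomplete download or a Git LFS pointer file instead of the real weights. The helper below (a hypothetical name, not part of the repo) checks the file's magic bytes:

```python
# Sketch: check whether a .pt file plausibly contains torch weights.
# New-style torch.save files are zip archives starting with b'PK';
# legacy saves start with the pickle protocol byte b'\x80'. A file
# starting with anything else (e.g. text) will raise UnpicklingError.
def looks_like_torch_checkpoint(path):
    with open(path, 'rb') as f:
        head = f.read(4)
    return head[:2] == b'PK' or head[:1] == b'\x80'
```

If this returns False for the downloaded file, re-downloading the checkpoint (or fetching the actual LFS object) may be worth trying before debugging the loading code.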

@Ujjwal238
Author

Could you please share the precision and recall of the pre-change and post-change classes as well?
