
Some questions about Memory #14

Open
9p15p opened this issue Apr 2, 2020 · 4 comments

@9p15p

9p15p commented Apr 2, 2020

Hi, sir! Thank you for your fine work, but I still have some questions.

  1. Should the memory part of the training code (something like the memory part in the eval code) be placed under `with torch.no_grad()`? Does the memory require gradients?
@seoungwugoh
Owner

@9p15p Yes, the memory also requires gradients. Specifically, we learn how to encode a memory from [frame, mask]. Naturally, every feed-forward operation is done without `torch.no_grad()` during training.
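For illustration, a minimal sketch of that gradient behaviour, using a hypothetical `memory_encoder` callable (not the repo's actual API): during training the memory is encoded with gradients enabled, while at evaluation time it can safely sit under `torch.no_grad()`.

```python
import torch

def encode_memory_train(memory_encoder, frame, mask):
    # Training: no torch.no_grad() here -- gradients must flow through
    # the memory encoder so the network learns how to encode [frame, mask].
    key, value = memory_encoder(frame, mask)
    return key, value

def encode_memory_eval(memory_encoder, frame, mask):
    # Evaluation: gradients are not needed, so no_grad() saves memory.
    with torch.no_grad():
        key, value = memory_encoder(frame, mask)
    return key, value
```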

@9p15p
Author

9p15p commented Apr 23, 2020

Thank you for your reply.
I have another question: how should we backpropagate the loss? In other words, where should we put `loss.backward()`? I can think of 3 strategies:

  1. Calculate the loss and call `loss.backward()` for every object at every frame. (In the experiment, we would call it with `retain_graph=True` twice, for the last two frames, and after both frames' losses are computed call it once more without `retain_graph=True`.)

  2. Calculate the loss for every object at every frame, but only call `loss.backward()` after all frames' losses have been computed.

  3. Only calculate the loss for the final (third) mask, ignore the middle (second) mask, and call a single `loss.backward()` at the end.

## Note: I use their different colors to pick out the different objects in all the images. I train the model using only one object with 3 frames and 3 masks at a time. ##

Maybe none of these three strategies is right.
Or maybe my setup in the note above is improper, and we should train on all objects at the same time.

Looking forward to your advice!
Thank you!

@seoungwugoh
Owner

What I did is option 2.
We sum up all the losses and call `backward()` and `step()` at the end of the iteration.
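As a rough illustration of option 2, here is a minimal sketch of one training iteration. The interface (`model.memorize`, `model.segment`, the tensor shapes, and the dimension the memory is concatenated along) is assumed for illustration and is not the repo's actual API.

```python
import torch
import torch.nn.functional as F

def train_iteration(model, optimizer, frames, masks):
    """Option 2: sum the per-frame losses and call backward()/step()
    once at the end of the iteration.

    Assumed (placeholder) interface:
      frames: [T, 3, H, W] clip, masks: [T, K, H, W] one-hot masks,
      model.memorize(frame, mask) -> (key, value),
      model.segment(frame, keys, values) -> logits of shape [1, K, H, W].
    """
    optimizer.zero_grad()

    # First frame + ground-truth mask goes into memory (gradients enabled).
    keys, values = model.memorize(frames[0:1], masks[0:1])

    total_loss = 0.0
    for t in range(1, frames.shape[0]):
        logits = model.segment(frames[t:t + 1], keys, values)
        target = masks[t:t + 1].argmax(dim=1)          # class indices
        total_loss = total_loss + F.cross_entropy(logits, target)

        # Memorize the frame with its soft prediction so the graph stays
        # differentiable through the mask path (an assumption here).
        soft_mask = F.softmax(logits, dim=1)
        new_key, new_value = model.memorize(frames[t:t + 1], soft_mask)
        keys = torch.cat([keys, new_key], dim=2)       # assumed temporal dim
        values = torch.cat([values, new_value], dim=2)

    # Single backward/step over the summed loss for the whole clip.
    total_loss.backward()
    optimizer.step()
    return float(total_loss)
```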

@9p15p
Author

9p15p commented May 4, 2020

Thank you, sir.
My best reimplementation is still not satisfactory.
However, in my own experiment, calling `loss.backward()` every frame gives better performance and converges more quickly (using `retain_graph=True` for the second frame's loss). I will try again.
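For comparison, here is a sketch of that per-frame variant, with the same placeholder interface as in the earlier sketch; the only difference is where `backward()` is called and the use of `retain_graph=True` for the non-final frames.

```python
import torch
import torch.nn.functional as F

def train_iteration_per_frame(model, optimizer, frames, masks):
    """Strategy 1: call backward() after every frame, keeping the graph
    alive with retain_graph=True except for the final frame."""
    optimizer.zero_grad()
    keys, values = model.memorize(frames[0:1], masks[0:1])

    T = frames.shape[0]
    for t in range(1, T):
        logits = model.segment(frames[t:t + 1], keys, values)
        loss = F.cross_entropy(logits, masks[t:t + 1].argmax(dim=1))
        # Earlier frames share their graph with later steps, so retain it;
        # the final call is free to release it.
        loss.backward(retain_graph=(t < T - 1))

        soft_mask = F.softmax(logits, dim=1)
        new_key, new_value = model.memorize(frames[t:t + 1], soft_mask)
        keys = torch.cat([keys, new_key], dim=2)
        values = torch.cat([values, new_value], dim=2)

    optimizer.step()  # gradients from all frames have accumulated
```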

best wishes!
