
Version of densenet #391

Closed
haofanwang opened this issue Jan 17, 2018 · 8 comments
haofanwang commented Jan 17, 2018
May I ask which version of densenet is in torchvision.models — the original, or efficient_densenet_pytorch? The original is memory hungry. If it's the original version, would the pytorch team consider adding the efficient version to the model zoo?

soumith (Member) commented Jan 17, 2018

We plan to add a memory-efficient version soon via pytorch/pytorch#4594.
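For context, the checkpointing feature that the memory-efficient version builds on is `torch.utils.checkpoint.checkpoint`: it trades compute for memory by not storing a block's intermediate activations during forward and recomputing them during backward. A minimal standalone sketch (not the DenseNet code; `expensive_block` is an invented stand-in for a bottleneck stack), assuming a reasonably recent PyTorch:

```python
import torch
from torch.utils.checkpoint import checkpoint

def expensive_block(x):
    # Stand-in for an expensive layer stack; its intermediate
    # activations are recomputed during backward instead of stored.
    return torch.relu(x * 2).sum()

# checkpoint needs at least one input with requires_grad=True,
# otherwise the recomputed graph has nothing to backpropagate into
# (the caveat fmassa raises below).
x = torch.ones(4, requires_grad=True)
out = checkpoint(expensive_block, x)
out.backward()
print(x.grad)  # tensor([2., 2., 2., 2.])
```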

gpleiss (Contributor) commented Apr 26, 2018

Now that PyTorch 0.4 is officially out, I'm making the efficient_densenet_pytorch code use the checkpointing feature. I can make a PR to this repo once I get it working!

fmassa (Member) commented Apr 26, 2018

@gpleiss that would be a nice addition! Maybe by specifying a constructor argument that dispatches to checkpoint? The only thing we need to keep in mind is that checkpoint currently requires the input to have requires_grad=True, which is suboptimal in cases where we don't checkpoint.

gpleiss (Contributor) commented Apr 26, 2018

@fmassa I'm thinking something like this?

# prev_features = [feat_1, feat_2, ...]
# ...
if self.efficient and any(f.requires_grad for f in prev_features):
    bottleneck_output = checkpoint(bn_function, *prev_features)
else:
    bottleneck_output = bn_function(*prev_features)
# ...
# ...

And the self.efficient flag is something that can be passed in by the user.
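A minimal, torch-free sketch of the dispatch logic above, runnable on its own. `DummyTensor`, `bn_function`, and `checkpoint` here are invented stand-ins for illustration; in the real model they would be `torch.Tensor`, the DenseNet bottleneck function, and `torch.utils.checkpoint.checkpoint`:

```python
class DummyTensor:
    """Stand-in for torch.Tensor: just a value plus a requires_grad flag."""
    def __init__(self, value, requires_grad=False):
        self.value = value
        self.requires_grad = requires_grad

def bn_function(*features):
    # Stand-in for the bottleneck (BN-ReLU-Conv) function.
    return sum(f.value for f in features)

def checkpoint(fn, *args):
    # Stand-in for torch.utils.checkpoint.checkpoint.
    return fn(*args)

def forward(prev_features, efficient):
    # prev_features is a list, so the requires_grad check must iterate
    # over the features rather than touch .requires_grad on the list.
    if efficient and any(f.requires_grad for f in prev_features):
        return checkpoint(bn_function, *prev_features)
    return bn_function(*prev_features)

feats = [DummyTensor(1.0, requires_grad=True), DummyTensor(2.0)]
print(forward(feats, efficient=True))  # 3.0
```

Either branch produces the same output; the `efficient` flag only decides whether the bottleneck's activations are stored or recomputed.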

fmassa (Member) commented Apr 26, 2018

That sounds good!

gpleiss (Contributor) commented Apr 26, 2018

I'm profiling it now for my repo, and it seems to be good! Should have a PR ready tomorrow.

gpleiss (Contributor) commented May 23, 2018

So sorry for the late response on this...

The efficient densenet code seems to work great on a single GPU. However, on multiple GPUs with nn.DataParallel, @wandering007 points out that the checkpointing feature is quite slow (see gpleiss/efficient_densenet_pytorch#36). I think this is because checkpointing requires some sort of inter-GPU synchronization.

I'm opening up an issue in PyTorch about this. I'm holding off on a PR for now.

fmassa (Member) commented May 24, 2018

Sounds good, thanks @gpleiss !

@fmassa fmassa closed this as completed Feb 21, 2021
rajveerb pushed a commit to rajveerb/vision that referenced this issue Nov 30, 2023