inconsistency with the original paper #24

duducheng · 2019-01-24T08:32:44Z

Hello, thanks for your nice code!

I found two inconsistencies with the original paper, and both should be easy to fix:

The gamma: in the original paper, every block_mask is a complete square (or cube), since the mask seeds are only sampled in the central region of the feature map.
The paper says each channel uses a different mask, while your implementation uses the same mask for all channels.

I only just noticed these. I do not actually know whether they are effective tricks, since the paper does not discuss them in enough detail :)
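
For reference, here is a minimal sketch of the paper's formula for gamma, assuming a square feature map. The name `dropblock_gamma` and its arguments are hypothetical and not part of this repository's API. The second factor compensates for seeds being drawn only where a full block fits, which is the point about the central region above.

```python
def dropblock_gamma(drop_prob: float, block_size: int, feat_size: int) -> float:
    """Seed rate gamma for a feat_size x feat_size map (hypothetical helper).

    gamma = drop_prob / block_size^2 * feat_size^2 / (feat_size - block_size + 1)^2
    """
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must be in [0, 1]")
    if not 1 <= block_size <= feat_size:
        raise ValueError("block_size must be between 1 and feat_size")
    # Number of seed positions per axis whose block lies fully inside the map.
    valid = feat_size - block_size + 1
    return (drop_prob / block_size ** 2) * (feat_size ** 2 / valid ** 2)
```

With `block_size=1` the second factor is 1, so gamma reduces to `drop_prob`, as in ordinary dropout.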

miguelvr · 2019-01-24T10:46:25Z

The gamma issue is a minor thing but I can have a look at it.

The channels share the same mask in the paper.

duducheng · 2019-01-24T10:55:04Z

“We experimented with a shared DropBlock mask across different feature channels or each feature
channel has its DropBlock mask. Algorithm 1 corresponds to the latter, which tends to work better in
our experiments.” (page 2 bottom line)
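
The two variants in that quote could be sketched in PyTorch roughly as follows. This is an illustration under stated assumptions, not the repository's code: `dropblock_keep_mask` and the `per_channel` flag are hypothetical names, and the input is assumed to be a float tensor of shape (N, C, H, W). Seeds are drawn only where a whole block fits, so every dropped block is a complete square.

```python
import torch
import torch.nn.functional as F


def dropblock_keep_mask(x, gamma, block_size, per_channel=True):
    """Return a keep mask (1 = keep, 0 = drop) shaped like x (hypothetical helper)."""
    n, c, h, w = x.shape
    if block_size > min(h, w):
        raise ValueError("block_size must not exceed the feature map size")
    # One mask per channel, or a single mask broadcast over all channels.
    channels = c if per_channel else 1
    # Seeds mark the top-left corner of a block; restrict them to positions
    # where the full block_size x block_size block stays inside the map.
    valid_h, valid_w = h - block_size + 1, w - block_size + 1
    seeds = (torch.rand(n, channels, valid_h, valid_w, device=x.device) < gamma).to(x.dtype)
    # Pad by block_size - 1 on each side, then a stride-1 max pool grows each
    # seed into its block and restores the (h, w) spatial size.
    pad = block_size - 1
    dropped = F.max_pool2d(F.pad(seeds, (pad, pad, pad, pad)), kernel_size=block_size, stride=1)
    return (1 - dropped).expand(n, c, h, w)
```

To apply the mask, multiply the activations by it and rescale by `mask.numel() / mask.sum()`, which is the normalization step in the paper's Algorithm 1 (guard against a mask that is all zeros).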

miguelvr · 2019-01-24T14:52:32Z

Sure, that is easily fixable

Expect it soon

Edit: you can also do a PR if you want

huyvnphan · 2019-07-12T13:56:35Z

Hi,
Any updates on this?
Best

miguelvr · 2019-07-15T08:13:25Z

I haven't had much free time to deal with this, but I will review and accept merge requests

JarvisKevin · 2019-08-10T01:30:16Z

I also found some differences between the paper and the code.

Eliza-and-black · 2021-12-04T08:23:37Z

To solve this issue, you could have a look at this fork (only for DropBlock2D)

miguelvr · 2021-12-04T13:59:55Z

I would encourage you to do a pull request

JohnDLee · 2022-01-25T04:05:42Z

If you do look at the code linked above, note that mask_center is not created on the input's device, so the nn.ZeroPad2d call runs on the CPU by default. Since I was training on a GPU, this slowed a single forward call of my model (which uses many DropBlock layers) from 0.15 seconds to 3 seconds.
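
A minimal sketch of the kind of fix I mean, assuming the mask is sampled inside `forward` from an input tensor `x`. The function name `sample_seeds_on_device` is hypothetical and this is not the code from the fork. The key part is passing `device=x.device` when the tensor is created, so later padding and pooling stay on the same device as the activations.

```python
import torch


def sample_seeds_on_device(x, gamma, block_size):
    """Sample Bernoulli(gamma) seeds on the same device as x (hypothetical helper)."""
    n, c, h, w = x.shape
    shape = (n, c, h - block_size + 1, w - block_size + 1)
    # Without device=..., torch.rand allocates on the default device (usually
    # the CPU), and every later op on the mask runs there as well.
    return (torch.rand(shape, device=x.device) < gamma).to(x.dtype)
```

Moving an already created mask with `.to(x.device)` also works, but allocating it on the right device avoids the extra host-to-device copy on every forward call.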

miguelvr added the enhancement label on Feb 3, 2019

miguelvr mentioned this issue on Feb 3, 2019: same results as traditional dropout if block_size=1 #26 (open)

miguelvr added the question, enhancement, and v0.4 labels and removed the enhancement and question labels on Feb 3, 2019
