
Using a custom dataset with 512x512 shape produces an error #2

Open
DenisN03 opened this issue Jul 18, 2022 · 3 comments
DenisN03 commented Jul 18, 2022

Hello.
I want to run training on my dataset at 512x512 resolution. The training phase passes without problems, but an error occurs during validation:

File "/home/user/work/hila/mmseg/models/backbones/hila.py", line 111, in forward
    Xbp = Xbp.reshape(B, C, patch_size[0] ** 2, Ht * Wt).permute(0, 3, 2, 1)
RuntimeError: shape '[1, 128, 16, 920]' is invalid for input of size 1802240

Could you tell me, please, what could be the problem?
My config file:
hila.b1.1024x1024.city.160k_S234_my.txt

@1234gary
Collaborator

Hey Denis,

Looking at your config, I can see two things that might cause issues.

  1. Around line 34, you should use 'whole' testing instead of sliding-window testing:
    test_cfg=dict(mode='whole'))

  2. Around lines 115 and 139, your resize should have keep_ratio=False; otherwise the dimensions will keep the original image's aspect ratio.
    dict(type='Resize', keep_ratio=False),
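Both changes together might look like the following sketch of an mmseg-style config fragment. The surrounding keys (`img_scale`, the pipeline layout) are illustrative assumptions, not copied from the actual config file:

```python
# Illustrative mmseg-style config fragment (keys assumed from a typical
# SegFormer/HILA config; adapt to the real file's structure).

crop_size = (512, 512)

# 1. Use whole-image testing instead of sliding-window testing:
test_cfg = dict(mode='whole')

# 2. Resize without preserving the aspect ratio, so every input
#    becomes exactly 512x512 before entering the backbone:
resize_step = dict(
    type='Resize',
    img_scale=crop_size,
    keep_ratio=False,  # force exact 512x512 instead of keeping the ratio
)
```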

I've run your config and wasn't able to reproduce the error you got. My guess is that one of these things is happening:

  1. The validation inputs might not be loaded at the correct size
  2. Ht and Wt aren't matching the actual dimensions of the input (perhaps the input isn't being downsampled properly?)

Could you try checking those values?

Those dimension numbers look quite strange to me: I would expect 512x512 to be downsampled 16x in layer 3, giving 32x32 = 1024 positions. In your case Ht * Wt = 920, which isn't a square number as you would expect, and the reported size 1802240 actually corresponds to a [1, 128, 16, 880] shape.
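The arithmetic behind those numbers can be checked directly. This is just a sanity calculation on the values from the error message, not code from the repo:

```python
# The reshape asks for B * C * patch_size**2 * (Ht * Wt) elements.
B, C, patch_sq = 1, 128, 16            # values from the traceback
requested = B * C * patch_sq * 920     # shape the reshape asked for
actual = 1802240                       # element count PyTorch reported

print(requested)                       # 1884160, not 1802240 -> mismatch
print(actual // (B * C * patch_sq))    # 880 -> the tensor really holds Ht*Wt = 880

# With whole-image testing and keep_ratio=False, a 512x512 input
# downsampled 16x gives a 32x32 grid, i.e. Ht * Wt = 1024 (a square).
print((512 // 16) ** 2)                # 1024
```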

@DenisN03
Author

DenisN03 commented Jul 19, 2022

Thanks for your help, Gary.

I changed the keep_ratio setting to False and that did the trick.

Now I am facing another problem. After testing is complete, an error occurs:

/usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 20 leaked semaphores to clean up at shutdown
  len(cache))

During the testing phase, RAM consumption grows very quickly.
At the end of the test, swap overflows and the script stops.
[Three screenshots from 2022-07-19 showing system memory usage during testing]

@1234gary
Collaborator

Hey Denis,

Are you able to get the test results? What is the exact command you are running for evaluation? It's hard for me to diagnose the issue without those.

I know that F-Score can be very heavy on RAM; for that reason, if you are running F-Score, you might want to run it in batches and average the scores together. This gives the same results as running the F-Score on all images at once.
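A minimal sketch of that batching idea, assuming the F-Score is computed per image and then macro-averaged (the function names and per-image count tuples here are illustrative, not mmseg's API):

```python
def f_score(tp, fp, fn, beta=1.0):
    """Per-image F-score from true-positive / false-positive / false-negative
    pixel counts: F_beta = (1 + b^2) * tp / ((1 + b^2) * tp + b^2 * fn + fp)."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom > 0 else 0.0

def batched_macro_f_score(batches):
    """Macro-average per-image F-scores across batches of (tp, fp, fn) tuples.

    Accumulating the sum and count across batches gives exactly the same
    macro average as scoring every image at once, so only one batch of
    predictions ever needs to be held in memory."""
    total, count = 0.0, 0
    for batch in batches:
        total += sum(f_score(tp, fp, fn) for tp, fp, fn in batch)
        count += len(batch)
    return total / count if count else 0.0

# Splitting the images into batches does not change the result:
images = [(90, 10, 5), (80, 20, 10), (70, 5, 25), (95, 2, 3)]
all_at_once = batched_macro_f_score([images])
in_batches = batched_macro_f_score([images[:2], images[2:]])
assert abs(all_at_once - in_batches) < 1e-12
```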
