
heavy unet tests - not working yet #327

Merged: 56 commits into main from tests_v0_3_5 on Nov 19, 2024
Conversation

mzouink (Member) commented Nov 12, 2024

While running dacapo I hit multiple bugs, so I created a heavier test that covers most UNet scenarios.
I think we need to make sure that all the unit tests pass, and we also need more realistic tests.
@pattonw

mzouink (Member, Author) commented Nov 13, 2024

related: #331

mzouink and others added 17 commits November 13, 2024 11:02
The Architecture superclass says that we should be returning `Coordinate`s for these properties.
This is important because we regularly use these values in arithmetic where we expect element-wise +, -, *, /.
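For context, a minimal illustration of that element-wise behaviour, assuming `funlib.geometry.Coordinate` and made-up values:

```python
from funlib.geometry import Coordinate

voxel_size = Coordinate((4, 4, 4))       # world units per voxel (illustrative)
input_shape = Coordinate((1, 188, 188))  # in voxels

# Coordinate arithmetic is element-wise, so shape * voxel_size gives the
# input size in world units; plain tuples would not support this.
input_size = input_shape * voxel_size    # -> Coordinate((4, 752, 752))
```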
…re robust

If we upsample, we probably want to apply a convolution to finetune the outputs rather than simply upsampling, which we could do outside of a network. Assuming a kernel of size (3, 3, 3) fails for 2D networks that process data with kernels of size (1, 3, 3). We now just use the last kernel in `kernel_size_up`, which is a bit more robust.
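A rough sketch of that choice (hypothetical helper in plain PyTorch, not the actual CNNectome UNet code):

```python
import torch
from torch import nn

def upsample_head(in_channels, out_channels, scale_factor, kernel_size_up):
    # Reuse the last kernel of the up path instead of hard-coding (3, 3, 3),
    # so a pseudo-2D network with (1, 3, 3) kernels keeps its singleton axis.
    kernel = kernel_size_up[-1]
    return nn.Sequential(
        nn.Upsample(scale_factor=scale_factor, mode="nearest"),
        nn.Conv3d(in_channels, out_channels, kernel_size=kernel),
    )

head = upsample_head(12, 1, scale_factor=(1, 2, 2), kernel_size_up=[(1, 3, 3)])
x = torch.randn(1, 12, 1, 64, 64)  # (batch, channels, z, y, x) with z == 1
print(head(x).shape)               # torch.Size([1, 1, 1, 126, 126])
```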
… during training

Otherwise the BatchNorm breaks
This should probably just be switched to use `funlib.persistence.prepare_ds`.
Fixed a couple of bugs:
1) CNNectome UNet wasn't following the API defined by the Architecture
`ABC`. It was supposed to return `Coordinate` class instances for
voxel_size, input_shape, etc.
2) CNNectome UNet had logical errors in the definition of kernel_size_up
and kernel_size_down.
3) The new tests were expecting multi-channel data for the 2D UNet and
single-channel data for the 3D UNet despite always getting single-channel data.
4) Fixed the voxel_size attr in the test fixtures.
There appear to be some python formatting errors in
dba261f. This pull request
uses the [psf/black](https://github.com/psf/black) formatter to fix
these issues.
pattonw (Contributor) commented Nov 13, 2024

Looks like we both fixed the dims error in different places.
The function signature of input_shape already defined the return type as `Coordinate`, so it is a bug if input_shape isn't already a `Coordinate`. I fixed it inside the CNNectome UNet.

pattonw (Contributor) commented Nov 13, 2024

Just want to make a note here: the tests that are still failing are of the following forms:

  1. "The size of tensor a (188) must match the size of tensor b (94) at non-singleton dimension ...". This is caused by upsampling the input data and comparing it to gt that is at the same resolution as the input data. We should create a separate dataset fixture for upsampled data.
  2. Assertion errors. These are caused by us checking for some threshold absolute difference between the weights before and after training a few iterations (roughly the check sketched below). Not sure if this assumption always holds.
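For reference, the failing check is roughly of this shape (a simplified sketch, not the actual dacapo test):

```python
import torch

def assert_weights_changed(model, loss_fn, optimizer, batches, atol=1e-4):
    # Snapshot the weights, train a few iterations, then require that at
    # least one parameter moved by more than `atol`. This is the assumption
    # that may not hold for every architecture/task combination.
    before = {n: p.detach().clone() for n, p in model.named_parameters()}
    for raw, gt in batches:
        optimizer.zero_grad()
        loss = loss_fn(model(raw), gt)
        loss.backward()
        optimizer.step()
    moved = any(
        (p.detach() - before[n]).abs().max() > atol
        for n, p in model.named_parameters()
    )
    assert moved, "weights did not change after training"
```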

mzouink (Member, Author) commented Nov 13, 2024

> Looks like we both fixed the dims error in different places. The function signature of input_shape already defined the return type as `Coordinate`, so it is a bug if input_shape isn't already a `Coordinate`. I fixed it inside the CNNectome UNet.

I removed my change.

mzouink (Member, Author) commented Nov 13, 2024

  1. "The size of tensor a (188) must match the size of tensor b (94) at non-singleton dimension ...". This is caused by the upsampling input data and comparing to gt that is at the same resolution as the input data. We should create a separate dataset fixture for upsampled data.

i will split the test and i will create a dataset for upsample unet

  1. Assertion Errors. These are caused by us checking for some threshold absolute difference between the weights before and after training a few iterations. Not sure if this assumption always holds.

i agree with you but we need to figure a way to test if the model is training. because i remember we had a problem with distance without adding 4 dimension

pattonw (Contributor) commented Nov 13, 2024

> I agree with you, but we need to figure out a way to test whether the model is training, because I remember we had a problem with distance when the 4th dimension wasn't added.

I think this was only after validation, where we saw a strange jump in the loss. It was caused by putting the model in eval mode for validation and then not switching it back after completing prediction. I think this went unnoticed because it only makes a difference for layers like BatchNorm and Dropout that behave differently between train and eval. BatchNorm was added by default, so while we were expecting the model to behave the same in train and eval, we were getting this strange behavior.

I'm pretty sure that's fixed now, and I haven't seen this jump in the minimal tutorial anymore.
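For reference, a minimal sketch of the pattern that caused the jump (plain PyTorch, not dacapo's actual trainer):

```python
import torch

def validate(model, val_batches, metric):
    # eval mode: BatchNorm uses running stats, Dropout is disabled.
    model.eval()
    with torch.no_grad():
        scores = [metric(model(raw), gt) for raw, gt in val_batches]
    # Forgetting this switch back was the bug: training then continued in
    # eval mode, which only matters for BatchNorm/Dropout-like layers,
    # hence the strange jump in the loss after validation.
    model.train()
    return scores
```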

We should be testing that our models train, but I think a better way to do this is on full examples, so we should have the minimal tutorial cover more cases (instance/semantic, 2D/3D, multi-class/single-class, single-channel raw/multi-channel raw) and test that the loss and validation metrics reach reasonable values.

pattonw (Contributor) commented Nov 13, 2024

I'm thinking something like this:
benchmark action repo
example results

Except instead of only tracking time (which we could still do), we would also track loss / validation scores for each of our tutorials.

This repo looks pretty tailor-made for performance benchmarking, but we may be able to hack in the data we want to track, or find a similar/better way.

mzouink and others added 3 commits November 15, 2024 11:29
There appear to be some python formatting errors in
97abfa0. This pull request
uses the [psf/black](https://github.com/psf/black) formatter to fix
these issues.
pattonw (Contributor) commented Nov 18, 2024

Yes, if you have 3D data you should be using 3D convolutions. Even if you want a 2D model, you just use (1, 3, 3) as the kernel shape; it's still a Conv3d though.
This is convenient because you can then pass a 3D cube through your model at predict time, e.g. a (channels x 200 x 200 x 200) volume can be processed by a model defined with kernel size (1, 3, 3), allowing you to generate predictions with a 2D model in 3D chunks so you can save them to zarr with a reasonable chunk size.
If you made your model use 2D convolutions, you would have to do a bunch of axis permutations to turn one of the physical dimensions into a batch dimension and back.
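A minimal sketch of that point in plain PyTorch (illustrative shapes, not dacapo code):

```python
import torch
from torch import nn

# A "2D" model expressed as Conv3d with a singleton z kernel.
conv = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=(1, 3, 3))

# At predict time a full 3D chunk goes through in one call; every z-slice
# is processed independently because the kernel has depth 1.
chunk = torch.randn(1, 1, 200, 200, 200)  # (batch, channels, z, y, x)
print(conv(chunk).shape)                  # torch.Size([1, 8, 200, 198, 198])
```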

This is necessary for compatibility with Windows and macOS.
If the kernel size of the task head is set to 3, this will cause errors if you are trying to use a model with input shape `(1, y, x)`.
The head should just map the penultimate layer embeddings into the appropriate dimensions for the task, so it doesn't need a kernel size larger than 1.
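A minimal sketch of what that looks like (assumed channel counts, not the exact task head):

```python
from torch import nn

# The head only maps the penultimate embeddings (e.g. 12 feature channels)
# to the task's output channels, so a kernel size of 1 is enough and stays
# valid even for pseudo-2D inputs with shape (1, y, x).
head = nn.Conv3d(in_channels=12, out_channels=3, kernel_size=1)
```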
pattonw (Contributor) commented Nov 19, 2024

I made a pull request onto this branch that has 2D and 3D data with appropriate models, as well as a pseudo-2D model on 3D data.

mzouink and others added 3 commits November 19, 2024 09:38
Simplified the parameterized train test and added validation. Fixed
bugs that were found.
There appear to be some python formatting errors in
85a6dda. This pull request
uses the [psf/black](https://github.com/psf/black) formatter to fix
these issues.
mzouink (Member, Author) commented Nov 19, 2024

Thanks @pattonw! The tests are passing now. Shall we merge to main?

pattonw (Contributor) commented Nov 19, 2024

Yeah, I think so.

@mzouink mzouink merged commit f5c4b36 into main Nov 19, 2024
4 checks passed
@mzouink mzouink deleted the tests_v0_3_5 branch November 19, 2024 16:07

codecov bot commented Nov 19, 2024

Codecov Report

Attention: Patch coverage is 90.41096% with 7 lines in your changes missing coverage. Please review.

Project coverage is 55.14%. Comparing base (fc4a5e6) to head (53217ed).
Report is 57 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| dacapo/experiments/tasks/one_hot_task.py | 60.00% | 2 Missing ⚠️ |
| ...experiments/tasks/predictors/distance_predictor.py | 90.00% | 2 Missing ⚠️ |
| .../experiments/datasplits/datasets/dataset_config.py | 50.00% | 1 Missing ⚠️ |
| dacapo/experiments/datasplits/simple_config.py | 0.00% | 1 Missing ⚠️ |
| .../experiments/tasks/predictors/one_hot_predictor.py | 87.50% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #327      +/-   ##
==========================================
+ Coverage   48.37%   55.14%   +6.76%     
==========================================
  Files         184      184              
  Lines        6470     6514      +44     
==========================================
+ Hits         3130     3592     +462     
+ Misses       3340     2922     -418     

@mzouink mzouink restored the tests_v0_3_5 branch November 22, 2024 21:03
@mzouink mzouink deleted the tests_v0_3_5 branch November 22, 2024 21:03